tf.compat.v1.tpu.experimental.FtrlParameters

Optimization parameters for Ftrl with TPU embeddings.

tf.compat.v1.tpu.experimental.FtrlParameters(
    learning_rate: float,
    learning_rate_power: float = -0.5,
    initial_accumulator_value: float = 0.1,
    l1_regularization_strength: float = 0.0,
    l2_regularization_strength: float = 0.0,
    use_gradient_accumulation: bool = True,
    clip_weight_min: Optional[float] = None,
    clip_weight_max: Optional[float] = None,
    weight_decay_factor: Optional[float] = None,
    multiply_weight_decay_factor_by_learning_rate: Optional[bool] = None,
    multiply_linear_by_learning_rate: bool = False,
    beta: float = 0,
    allow_zero_accumulator: bool = False,
    clip_gradient_min: Optional[float] = None,
    clip_gradient_max: Optional[float] = None
)

Pass this to tf.estimator.tpu.experimental.EmbeddingConfigSpec via the optimization_parameters argument to set the optimizer and its parameters. See the documentation for tf.estimator.tpu.experimental.EmbeddingConfigSpec for more details.

estimator = tf.estimator.tpu.TPUEstimator(
    ...
    embedding_config_spec=tf.estimator.tpu.experimental.EmbeddingConfigSpec(
        ...
        optimization_parameters=tf.tpu.experimental.FtrlParameters(0.1),
        ...))

Args
`learning_rate`	a floating point value. The learning rate.
`learning_rate_power`	A float value, must be less or equal to zero. Controls how the learning rate decreases during training. Use zero for a fixed learning rate. See section 3.1 in the paper.
`initial_accumulator_value`	The starting value for accumulators. Only zero or positive values are allowed.
`l1_regularization_strength`	A float value, must be greater than or equal to zero.
`l2_regularization_strength`	A float value, must be greater than or equal to zero.
`use_gradient_accumulation`	setting this to `False` makes embedding gradients calculation less accurate but faster. Please see `optimization_parameters.proto` for details. for details.
`clip_weight_min`	the minimum value to clip by; None means -infinity.
`clip_weight_max`	the maximum value to clip by; None means +infinity.
`weight_decay_factor`	amount of weight decay to apply; None means that the weights are not decayed.
`multiply_weight_decay_factor_by_learning_rate`	if true, `weight_decay_factor` is multiplied by the current learning rate.
`multiply_linear_by_learning_rate`	When true, multiplies the usages of the linear slot in the weight update by the learning rate. This is useful when ramping up learning rate from 0 (which would normally produce NaNs).
`beta`	The beta parameter for FTRL.
`allow_zero_accumulator`	Changes the implementation of the square root to allow for the case of initial_accumulator_value being zero. This will cause a slight performance drop.
`clip_gradient_min`	the minimum value to clip by; None means -infinity.
`clip_gradient_max`	the maximum value to clip by; None means +infinity.

© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.4/api_docs/python/tf/compat/v1/tpu/experimental/FtrlParameters