tf.keras.optimizers.Ftrl

Optimizer that implements the FTRL algorithm.

Inherits From: Optimizer

Main aliases

tf.optimizers.Ftrl

Compat aliases for migration

See Migration guide for more details.

tf.compat.v1.keras.optimizers.Ftrl

tf.keras.optimizers.Ftrl(
    learning_rate=0.001, learning_rate_power=-0.5, initial_accumulator_value=0.1,
    l1_regularization_strength=0.0, l2_regularization_strength=0.0,
    name='Ftrl', l2_shrinkage_regularization_strength=0.0, beta=0.0,
    **kwargs
)

See Algorithm 1 of the paper listed under Reference below. This version supports both online L2 regularization (the L2 penalty given in that paper) and shrinkage-type L2 regularization (an L2 penalty added to the loss function).

Initialization:

$$t = 0$$

$$n_{0} = 0$$

$$\sigma_{0} = 0$$

$$z_{0} = 0$$

Update ($$i$$ is the variable index, $$\alpha$$ is the learning rate):

$$t = t + 1$$

$$n_{t,i} = n_{t-1,i} + g_{t,i}^{2}$$

$$\sigma_{t,i} = (\sqrt{n_{t,i} } - \sqrt{n_{t-1,i} }) / \alpha$$

$$z_{t,i} = z_{t-1,i} + g_{t,i} - \sigma_{t,i} * w_{t,i}$$

$$w_{t,i} = \begin{cases} - ((\beta+\sqrt{n_{t,i} }) / \alpha + 2 * \lambda_{2})^{-1} * (z_{t,i} - sgn(z_{t,i}) * \lambda_{1}) & \text{if } |z_{t,i}| > \lambda_{1} \\ 0 & \text{otherwise} \end{cases}$$

When shrinkage is enabled, the gradient in the update is replaced with gradient_with_shrinkage. See the documentation for the `l2_shrinkage_regularization_strength` parameter for more details.
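The update equations can be sketched for a single coordinate in plain Python. This is a minimal illustration of the formulas as written here, not TensorFlow's implementation: the function name `ftrl_step` and its argument names are hypothetical, shrinkage is left out, and the sketch assumes that the weight on the right-hand side of the $$z$$ update is the weight from before the step.

```python
import math


def ftrl_step(w, z, n, g, alpha=0.001, beta=0.0, lambda1=0.0, lambda2=0.0):
    """Apply one FTRL update to a single coordinate.

    w, z, n: previous weight, z accumulator, and squared-gradient accumulator.
    g: current gradient for this coordinate.
    Returns the updated (w, z, n).
    """
    n_new = n + g * g
    sigma = (math.sqrt(n_new) - math.sqrt(n)) / alpha
    # Assumption: the weight used here is the one from before this step.
    z_new = z + g - sigma * w

    if abs(z_new) <= lambda1:
        # L1 regularization clamps the weight to exactly zero.
        w_new = 0.0
    else:
        sgn = 1.0 if z_new > 0 else -1.0
        denom = (beta + math.sqrt(n_new)) / alpha + 2.0 * lambda2
        w_new = -(z_new - sgn * lambda1) / denom
    return w_new, z_new, n_new


# One step from the zero initialization given above, with a gradient of 1.0.
w, z, n = ftrl_step(0.0, 0.0, 0.0, g=1.0, alpha=0.1)
# w is approximately -0.1, z is 1.0, n is 1.0
```

Passing a `lambda1` larger than the magnitude of the new `z` value makes the same call return a weight of exactly `0.0`, which is how the L1 term produces sparse weights.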

Args
`learning_rate`	A `Tensor`, floating point value, or a schedule that is a `tf.keras.optimizers.schedules.LearningRateSchedule`. The learning rate.
`learning_rate_power`	A float value, must be less than or equal to zero. Controls how the learning rate decreases during training. Use zero for a fixed learning rate.
`initial_accumulator_value`	The starting value for accumulators. Only zero or positive values are allowed.
`l1_regularization_strength`	A float value, must be greater than or equal to zero.
`l2_regularization_strength`	A float value, must be greater than or equal to zero.
`name`	Optional name prefix for the operations created when applying gradients. Defaults to `"Ftrl"`.
`l2_shrinkage_regularization_strength`	A float value, must be greater than or equal to zero. This differs from `l2_regularization_strength` above in that the L2 above is a stabilization penalty, whereas this L2 shrinkage is a magnitude penalty. When the input is sparse, shrinkage only happens on the active weights.
`beta`	A float value, representing the beta value from the paper.
`**kwargs`	Keyword arguments. Allowed to be one of `"clipnorm"` or `"clipvalue"`. `"clipnorm"` (float) clips gradients by norm; `"clipvalue"` (float) clips gradients by value.
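
As a usage illustration, the arguments above might be passed as follows. This is a sketch rather than recommended settings: the hyperparameter values, the variable `w`, and the toy quadratic loss are all made up for the example. It uses `tf.GradientTape` with `apply_gradients` to take a single optimization step.

```python
import tensorflow as tf

# Illustrative values only; every argument satisfies the constraints listed above.
optimizer = tf.keras.optimizers.Ftrl(
    learning_rate=0.1,
    learning_rate_power=-0.5,                  # must be <= 0
    initial_accumulator_value=0.1,             # must be >= 0
    l1_regularization_strength=0.01,           # must be >= 0
    l2_regularization_strength=0.001,          # must be >= 0
    l2_shrinkage_regularization_strength=0.0,  # must be >= 0
    beta=0.0,
)

w = tf.Variable([1.0, -2.0])  # hypothetical parameters to train
before = w.numpy().copy()

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.square(w))  # toy quadratic loss

grads = tape.gradient(loss, [w])
optimizer.apply_gradients(zip(grads, [w]))  # one FTRL step, updates w in place

after = w.numpy()
```

Comparing `before` and `after` shows the effect of one step on `w`.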

Reference:

McMahan et al., "Ad Click Prediction: a View from the Trenches", KDD 2013.

Raises
`ValueError`	in case of any invalid argument.

© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.4/api_docs/python/tf/keras/optimizers/Ftrl