tf.contrib.distributions.bijectors.masked_autoregressive_default_template
Build the Masked Autoencoder for Distribution Estimation (MADE) network (Germain et al., 2015). (deprecated)
tf.contrib.distributions.bijectors.masked_autoregressive_default_template(
hidden_layers, shift_only=False, activation=tf.nn.relu, log_scale_min_clip=-5.0,
log_scale_max_clip=3.0, log_scale_clip_gradient=False, name=None, *args,
**kwargs
)
The network is wrapped in a make_template so that its variables are created only once. It takes the input and returns the shift (loc, "mu" in [Germain et al. (2015)][1]) and log_scale ("alpha" in [Germain et al. (2015)][1]) from the MADE network.
About Hidden Layers
Each element of hidden_layers should be greater than the input_depth (i.e., input_depth = tf.shape(input)[-1] where input is the input to the neural network). This is necessary to ensure the autoregressivity property.
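As an illustration of the autoregressive property, the sketch below builds block-structured 0/1 masks in plain Python and checks that output i has no path from any input j >= i. The helper names (block_mask, matmul, connectivity, is_autoregressive) and the block-slicing scheme are assumptions made for this example. This page does not say how the library builds its masks, so read it as one possible construction rather than the library's implementation.

```python
def block_mask(num_blocks, n_in, n_out, exclusive):
    """Return an n_out x n_in 0/1 mask as a list of rows."""
    d_in = n_in // num_blocks    # input units per block
    d_out = n_out // num_blocks  # output units per block
    mask = [[0] * n_in for _ in range(n_out)]
    row = d_out if exclusive else 0  # exclusive: a block may not see itself
    col = 0
    for _ in range(num_blocks):
        for r in range(row, n_out):
            for c in range(col, col + d_in):
                mask[r][c] = 1
        row += d_out
        col += d_in
    return mask


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def connectivity(input_depth, hidden_layers):
    """Count masked paths from each input (column) to each output (row)."""
    sizes = [input_depth] + list(hidden_layers) + [input_depth]
    conn = None
    for i in range(len(sizes) - 1):
        # Only the first layer is exclusive, so output i never sees input i.
        m = block_mask(input_depth, sizes[i], sizes[i + 1], exclusive=(i == 0))
        conn = m if conn is None else matmul(m, conn)
    return conn


def is_autoregressive(conn):
    """True if output i has no path from any input j >= i."""
    return all(conn[i][j] == 0
               for i in range(len(conn)) for j in range(i, len(conn[0])))


conn = connectivity(input_depth=3, hidden_layers=[6, 6])
print(is_autoregressive(conn))
```

In this sketch, a hidden width smaller than the input depth makes the integer block size n_out // num_blocks zero, which leaves every output disconnected from every input.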
About Clipping
This function also optionally clips the log_scale (but possibly not its gradient). This is useful because if log_scale is too small/large it might underflow/overflow making it impossible for the MaskedAutoregressiveFlow bijector to implement a bijection. Additionally, the log_scale_clip_gradient bool indicates whether the gradient should also be clipped. The default does not clip the gradient; this is useful because it still provides gradient information (for fitting) yet solves the numerical stability problem. I.e., log_scale_clip_gradient = False means grad[exp(clip(x))] = grad[x] exp(clip(x)) rather than the usual grad[clip(x)] exp(clip(x)).
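The two gradient rules above can be compared numerically. The following is a minimal plain-Python sketch for a scalar x. The function names clip and scale_and_grad are hypothetical, and the derivative is written out by hand rather than computed by TensorFlow.

```python
import math


def clip(x, lo, hi):
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(x, hi))


def scale_and_grad(x, lo=-5.0, hi=3.0, clip_gradient=False):
    """Return exp(clip(x)) and its derivative with respect to x."""
    scale = math.exp(clip(x, lo, hi))
    if clip_gradient:
        # Usual rule: d clip(x)/dx is 1 inside [lo, hi] and 0 outside.
        dclip = 1.0 if lo <= x <= hi else 0.0
    else:
        # Gradient not clipped: the clip is treated as the identity
        # when differentiating, so grad[x] = 1 is used everywhere.
        dclip = 1.0
    return scale, dclip * scale


# x = 10 lies above the default maximum of 3, so the value is clipped.
print(scale_and_grad(10.0, clip_gradient=False))
print(scale_and_grad(10.0, clip_gradient=True))
```

With the gradient left unclipped, the out-of-range input still receives a nonzero derivative equal to exp(3), whereas the usual rule returns zero there. Inside the clip range both rules agree.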
| Args | |
|---|---|
| hidden_layers | Python list-like of non-negative integer scalars indicating the number of units in each hidden layer, e.g., [512, 512]. |
| shift_only | Python bool indicating if only the shift term shall be computed. Default: False. |
| activation | Activation function (callable). Explicitly setting to None implies a linear activation. |
| log_scale_min_clip | float-like scalar Tensor, or a Tensor with the same shape as log_scale. The minimum value to clip by. Default: -5. |
| log_scale_max_clip | float-like scalar Tensor, or a Tensor with the same shape as log_scale. The maximum value to clip by. Default: 3. |
| log_scale_clip_gradient | Python bool indicating that the gradient of tf.clip_by_value should be preserved. Default: False. |
| name | A name for ops managed by this function. Default: "masked_autoregressive_default_template". |
| `*args` | tf.compat.v1.layers.dense arguments. |
| `**kwargs` | tf.compat.v1.layers.dense keyword arguments. |
| Returns | |
|---|---|
| shift | Float-like Tensor of shift terms (the "mu" in [Germain et al. (2015)][1]). |
| log_scale | Float-like Tensor of log(scale) terms (the "alpha" in [Germain et al. (2015)][1]). |
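To show how the two returned quantities are typically consumed, the sketch below applies an elementwise affine map y = x * exp(log_scale) + shift and inverts it, using plain Python lists. The function names are hypothetical. The shift and log_scale values are fixed by hand here, whereas in an actual flow they would come from the network, and the autoregressive ordering of the computation is deliberately left out.

```python
import math


def affine_forward(x, shift, log_scale):
    """y = x * exp(log_scale) + shift, elementwise."""
    return [xi * math.exp(a) + m for xi, m, a in zip(x, shift, log_scale)]


def affine_inverse(y, shift, log_scale):
    """x = (y - shift) * exp(-log_scale), elementwise."""
    return [(yi - m) * math.exp(-a) for yi, m, a in zip(y, shift, log_scale)]


def forward_log_det_jacobian(log_scale):
    """Log-determinant of the elementwise affine map: sum of log_scale."""
    return sum(log_scale)


x = [0.5, -1.0, 2.0]
shift = [0.1, 0.0, -0.3]      # hand-picked stand-ins for network output
log_scale = [0.0, 0.5, -1.0]  # hand-picked stand-ins for network output

y = affine_forward(x, shift, log_scale)
x_back = affine_inverse(y, shift, log_scale)
print(y, x_back, forward_log_det_jacobian(log_scale))
```

Because the scale is exp(log_scale), it is always positive, so the map stays invertible as long as log_scale remains in a numerically safe range.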
| Raises | |
|---|---|
| NotImplementedError | if rightmost dimension of inputs is unknown prior to graph execution. |
References
[1]: Mathieu Germain, Karol Gregor, Iain Murray, and Hugo Larochelle. MADE: Masked Autoencoder for Distribution Estimation. In International Conference on Machine Learning, 2015. https://arxiv.org/abs/1502.03509
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/contrib/distributions/bijectors/masked_autoregressive_default_template