tf.contrib.seq2seq.AttentionWrapper

Wraps another RNNCell with attention.

Inherits From: RNNCell

tf.contrib.seq2seq.AttentionWrapper(
    cell, attention_mechanism, attention_layer_size=None, alignment_history=False,
    cell_input_fn=None, output_attention=True, initial_cell_state=None, name=None,
    attention_layer=None, attention_fn=None, dtype=None
)

Args
`cell`	An instance of `RNNCell`.
`attention_mechanism`	A list of `AttentionMechanism` instances or a single instance.
`attention_layer_size`	A list of Python integers or a single Python integer, the depth of the attention (output) layer(s). If None (default), use the context as attention at each time step. Otherwise, feed the context and cell output into the attention layer to generate attention at each time step. If attention_mechanism is a list, attention_layer_size must be a list of the same length. If attention_layer is set, this must be None. If attention_fn is set, it must be guaranteed that the outputs of attention_fn also meet the above requirements.
`alignment_history`	Python boolean, whether to store alignment history from all time steps in the final output state (currently stored as a time major `TensorArray` on which you must call `stack()`).
`cell_input_fn`	(optional) A `callable`. The default is: `lambda inputs, attention: array_ops.concat([inputs, attention], -1)`.
`output_attention`	Python bool. If `True` (default), the output at each time step is the attention value. This is the behavior of Luong-style attention mechanisms. If `False`, the output at each time step is the output of `cell`. This is the behavior of Bahdanau-style attention mechanisms. In both cases, the `attention` tensor is propagated to the next time step via the state and is used there. This flag only controls whether the attention mechanism is propagated up to the next cell in an RNN stack or to the top RNN output.
`initial_cell_state`	The initial state value to use for the cell when the user calls `zero_state()`. Note that if this value is provided now, and the user uses a `batch_size` argument of `zero_state` which does not match the batch size of `initial_cell_state`, proper behavior is not guaranteed.
`name`	Name to use when creating ops.
`attention_layer`	A list of `tf.compat.v1.layers.Layer` instances or a single `tf.compat.v1.layers.Layer` instance taking the context and cell output as inputs to generate attention at each time step. If None (default), use the context as attention at each time step. If attention_mechanism is a list, attention_layer must be a list of the same length. If attention_layer_size is set, this must be None.
`attention_fn`	An optional callable function that allows users to provide their own customized attention function, which takes input (attention_mechanism, cell_output, attention_state, attention_layer) and outputs (attention, alignments, next_attention_state). If provided, the attention_layer_size should be the size of the outputs of attention_fn.
`dtype`	The cell dtype.
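
The interaction of `cell_input_fn`, `output_attention`, and `alignment_history` is easier to see in a toy step function. The following is a minimal sketch in plain Python, not the library's implementation: nested lists stand in for tensors, and `toy_cell`, `toy_attention`, and `wrapper_step` are hypothetical names invented for illustration. It assumes the wrapper first builds the cell input from the previous attention, then runs the cell, then computes the new attention.

```python
# Illustrative stand-ins only: nested lists replace tensors, and none of these
# helpers are part of TensorFlow.

def default_cell_input_fn(inputs, attention):
    """Mimics concat([inputs, attention], -1) for a batch of row vectors."""
    return [x + a for x, a in zip(inputs, attention)]


def toy_cell(cell_inputs):
    """Hypothetical wrapped cell: each output row is the sum of its input row."""
    return [[sum(row)] for row in cell_inputs]


def toy_attention(cell_output, memory):
    """Hypothetical attention: uniform alignments, context = mean over time."""
    steps = len(memory[0])
    alignments = [[1.0 / steps] * steps for _ in cell_output]
    context = [[sum(col) / steps for col in zip(*mem)] for mem in memory]
    return context, alignments


def wrapper_step(inputs, prev_attention, memory, output_attention=True,
                 cell_input_fn=default_cell_input_fn):
    """One decoding step; returns (output, attention, alignments)."""
    cell_inputs = cell_input_fn(inputs, prev_attention)
    cell_output = toy_cell(cell_inputs)
    attention, alignments = toy_attention(cell_output, memory)
    # output_attention only selects what is emitted; the attention is always
    # returned so the caller can feed it to the next step.
    output = attention if output_attention else cell_output
    return output, attention, alignments


# memory has shape [batch=2, time=2, depth=2].
memory = [[[1.0, 1.0], [3.0, 5.0]], [[2.0, 0.0], [4.0, 2.0]]]
inputs_per_step = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [1.0, 1.0]]]

attention = [[0.0, 0.0], [0.0, 0.0]]  # zero initial attention
alignment_history = []                # time major: alignment_history[t][b]
for step_inputs in inputs_per_step:
    output, attention, alignments = wrapper_step(step_inputs, attention, memory)
    alignment_history.append(alignments)
```

In the real wrapper the history is a `TensorArray`, so the equivalent of reading `alignment_history` here is calling `stack()` on the final state's alignment history.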

Raises
`TypeError`	If `attention_layer_size` is not None and exactly one of `attention_mechanism` and `attention_layer_size` is a list.
`ValueError`	If `attention_layer_size` is not None, `attention_mechanism` is a list, and its length does not match that of `attention_layer_size`; or if `attention_layer_size` and `attention_layer` are set simultaneously.
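
The argument rules behind these errors can be written out as a small checker. This is a sketch of the rules as documented in this section, not the library's own validation code; `check_attention_args` is a hypothetical helper, and the strings passed as mechanisms are placeholders for `AttentionMechanism` instances.

```python
def check_attention_args(attention_mechanism, attention_layer_size=None,
                         attention_layer=None):
    """Hypothetical checker that mirrors the documented Raises rules."""
    if attention_layer_size is not None and attention_layer is not None:
        raise ValueError(
            "Only one of attention_layer_size and attention_layer may be set.")
    if attention_layer_size is None:
        return  # nothing further to validate

    mechanism_is_list = isinstance(attention_mechanism, (list, tuple))
    size_is_list = isinstance(attention_layer_size, (list, tuple))
    if mechanism_is_list != size_is_list:
        raise TypeError(
            "attention_mechanism and attention_layer_size must both be lists "
            "or both be single values.")
    if mechanism_is_list and len(attention_mechanism) != len(attention_layer_size):
        raise ValueError(
            "attention_layer_size must have one entry per attention mechanism.")


# Valid combinations pass silently.
check_attention_args("mechanism", attention_layer_size=128)
check_attention_args(["m1", "m2"], attention_layer_size=[128, 64])
```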

Attributes
`graph`	Deprecated. This property will be removed in a future version. Instructions for updating: stop using this property, because tf.layers layers no longer track their graph.
`output_size`	Integer or TensorShape: size of outputs produced by this cell.
`scope_name`
`state_size`	The `state_size` property of `AttentionWrapper`.

Methods

`get_initial_state`

get_initial_state(
    inputs=None, batch_size=None, dtype=None
)

`zero_state`

zero_state(
    batch_size, dtype
)

Return an initial (zero) state tuple for this AttentionWrapper.

Note: Please see the initializer documentation for details of how to call zero_state if using an AttentionWrapper with a BeamSearchDecoder.

Args
`batch_size`	`0D` integer tensor: the batch size.
`dtype`	The internal state data type.

Returns
An `AttentionWrapperState` tuple containing zeroed out tensors and, possibly, empty `TensorArray` objects.

Raises
`ValueError`	(or, possibly at runtime, InvalidArgument), if `batch_size` does not match the batch size of the encoder output (the attention memory) passed to the wrapper object at initialization time.
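
The batch-size requirement matters most with beam search, where the memory is tiled before the wrapper is built. The sketch below shows only the arithmetic, under the assumption that the encoder output is repeated `beam_width` times per batch entry (the behavior documented for `tf.contrib.seq2seq.tile_batch`) and that `zero_state` is then requested with the tiled batch size. `tile_batch_sketch` and `check_zero_state_batch` are hypothetical pure-Python stand-ins, not TensorFlow functions; consult the `BeamSearchDecoder` documentation for the actual calling sequence.

```python
def tile_batch_sketch(batch, multiplier):
    """Repeats each batch entry `multiplier` times, keeping repeats adjacent."""
    return [entry for entry in batch for _ in range(multiplier)]


def check_zero_state_batch(memory, batch_size):
    """Hypothetical check: the requested batch size must match the memory's."""
    if len(memory) != batch_size:
        raise ValueError(
            "batch_size %d does not match memory batch size %d"
            % (batch_size, len(memory)))
    return batch_size


encoder_outputs = ["enc_a", "enc_b", "enc_c"]  # placeholders, true batch = 3
beam_width = 4

tiled_memory = tile_batch_sketch(encoder_outputs, beam_width)
# The wrapper sees the tiled memory, so zero_state needs the tiled batch size.
batch_size = check_zero_state_batch(
    tiled_memory, len(encoder_outputs) * beam_width)
```

Passing the untiled batch size (3 in this sketch) would fail the check, which corresponds to the `ValueError` described above.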

© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/contrib/seq2seq/AttentionWrapper