Old: Theano Attention Parameters

These settings are for the Theano backend and are not compatible with the TensorFlow backend. They apply to the rec layer.

Mandatory parameters

  • base: The layer that the attention mechanism uses as its base. If you do not specify this parameter, the "encoder" layer is taken as the base.
  • recurrent_transform: Setting this parameter to "attention_list" enables attention for the layer (see the sketch below).
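
As a minimal sketch, a network dict enabling attention on a rec layer might look like the following; the layer classes, names, and sizes here are assumptions for illustration, not taken from this page:

    network = {
        # hypothetical encoder layer; its name matches the default base
        "encoder": {"class": "rec", "unit": "lstm", "n_out": 512},
        "output": {
            "class": "rec",
            "unit": "lstm",
            "n_out": 512,
            "base": "encoder",                        # attention base layer
            "recurrent_transform": "attention_list",  # enables attention
        },
    }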

Additional parameters

  • attention_template
    Size of the template vector for attention.
    Default: 128

  • attention_distance
    Distance function used to compute the energy vector from the previous decoder state and the final encoder state (see the sketch below).
    Possible values:

    • "l2" : Euclidean distance
    • "sqr" : Squared distance
    • "dot" : Dot product
    • "l1" : L1 norm
    • "cos" : Cosine similarity
    • "rnn" : Exponential Linear Units [https://arxiv.org/pdf/1511.07289v1.pdf]

    Default : "l2"

  • attention_norm
    Normalization applied to the energies to obtain the alpha weights for attention (see the sketch below).
    Possible values:

    • "exp" : Exponential normalization
    • "sigmoid" : Sigmoid normalization
    • "lstm : Normalization with an LSTM

    Default : "exp"

  • attention_sharpening
    Degree by which to sharpen/scale the attention weights (see the sketch below).
    Default: 1.0
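
One common reading of sharpening, sketched below, is to scale the energies before the exponential normalization, which makes the resulting weights more peaked; the exact placement in the computation is an assumption:

    import numpy as np

    e = np.array([0.5, 1.0, 2.0])   # example energies
    gamma = 2.0                     # attention_sharpening
    w = np.exp(gamma * e - (gamma * e).max())
    alphas = w / w.sum()            # more peaked than with gamma = 1.0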

  • attention_nbest
    Attends only to the states corresponding to the n highest alpha weights, instead of attending to the entire sequence (see the sketch below).
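
A sketch of the idea, assuming the remaining weights are renormalized (not confirmed by this page):

    import numpy as np

    def nbest(alphas, n):
        # Keep only the n largest alpha weights, zero the rest, renormalize.
        idx = np.argsort(alphas)[-n:]
        masked = np.zeros_like(alphas)
        masked[idx] = alphas[idx]
        return masked / masked.sum()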

  • attention_glimpse
    Number of glimpses into previous decoder states.
    Default: 1
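
Putting the parameters together, a sketch of a rec-layer entry in a Theano-backend network dict; the layer names, classes, and parameter values are placeholders chosen for illustration:

    network = {
        "encoder": {"class": "rec", "unit": "lstm", "n_out": 512},
        "output": {
            "class": "rec",
            "unit": "lstm",
            "n_out": 512,
            "base": "encoder",
            "recurrent_transform": "attention_list",  # enable attention
            "attention_template": 128,
            "attention_distance": "dot",
            "attention_norm": "exp",
            "attention_sharpening": 2.0,
            "attention_nbest": 3,
            "attention_glimpse": 1,
        },
    }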