Layers with recurrent hidden state

This is basically the list of all layers which define get_rec_initial_extra_outputs, i.e. all layers which carry recurrent hidden state.

Most (all) of them carry this hidden state only when used inside a rec loop; otherwise they behave just as if you iterated over the time axis.
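
For illustration, a minimal sketch of a network dict, using CumsumLayer (layer class "cumsum") as the example; the layer names here are made up. Outside a rec loop, the layer just operates over the whole time axis; inside a RecLayer subnetwork, it keeps its running sum as recurrent hidden state from one step to the next.

```python
# Sketch of a RETURNN network dict (layer names are only for this example).
network = {
    # Outside a rec loop: cumulative sum over the whole time axis at once.
    "cumsum_global": {"class": "cumsum", "from": "data"},

    # Inside a rec loop: the running sum is carried as recurrent hidden state
    # (set up via get_rec_initial_extra_outputs) and updated frame by frame.
    "loop": {"class": "rec", "from": "data", "unit": {
        "cumsum_step": {"class": "cumsum", "from": "data:source"},
        "output": {"class": "copy", "from": "cumsum_step"},
    }},
}
```

The layers with such recurrent hidden state, and the state they keep: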

  • WindowLayer: state [B,W,...]
  • CumsumLayer: state which is the output itself
  • RecLayer: like RnnCellLayer if unit is a str, or like SubnetworkLayer if unit is a subnetwork definition
  • RnnCellLayer: state depends on the unit option, e.g. LSTMStateTuple; often the state is also the output itself
  • SelfAttentionLayer: k_left, v_left
  • KenLmStateLayer: state, step, scores
  • CumConcatLayer: state which is the output itself
  • SubnetworkLayer: any from subnet
  • BaseChoiceLayer, ChoiceLayer, DecideLayer, etc.: choice_scores, choice_src_beams
  • EditDistanceTableLayer: state, optional source_len
  • MaskedComputationLayer: any from subnet, _output
  • UnmaskLayer: t
  • TwoDLSTMLayer: state, output, iteration
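
As a rough sketch of what defining such state looks like: get_rec_initial_extra_outputs returns a dict of initial state tensors, one entry per piece of hidden state, and the layer then reads the previous values and writes updated ones each step (via rec_vars_outputs in the TF backend). The hypothetical layer below only shows that declaration; the import path and exact signature are paraphrased and may differ slightly from the current code.

```python
import tensorflow as tf
from returnn.tf.layers.base import LayerBase  # import path is an assumption


class MyStatefulLayer(LayerBase):
    """Hypothetical layer keeping a per-sequence step counter as recurrent hidden state."""
    layer_class = "my_stateful"

    @classmethod
    def get_rec_initial_extra_outputs(cls, batch_dim, rec_layer, **kwargs):
        # One dict entry per piece of hidden state; here a single int32 counter
        # per batch entry, starting at zero at the beginning of the rec loop.
        return {"counter": tf.zeros([batch_dim], dtype=tf.int32)}
```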