
Make redundant features argument optional for recurrent cells #3717

Open

carlosgmartin opened this issue Feb 25, 2024 · 4 comments
Labels
Priority: P2 - no schedule Best effort response and resolution. We have no plan to work on this at the moment.

Comments

carlosgmartin (Contributor) commented Feb 25, 2024

For recurrent cells such as LSTMCell, GRUCell, and OptimizedLSTMCell, the features argument of the constructor is redundant: it can be inferred from the carry input to the cell's __call__ method. (The only cell that currently uses self.features in its __call__ method is ConvLSTMCell, which ought to be modified to infer it from its carry input.)
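As a rough illustration of the inference (plain NumPy, not the actual Flax code; the weights and recurrence body are placeholders), the feature count is already present in the trailing dimension of the carry:

```python
import numpy as np

def cell_step(carry, inputs, w_i, w_h):
    # The feature count never needs to be stored on the cell:
    # it is the trailing dimension of the carry.
    features = carry.shape[-1]
    new_carry = np.tanh(inputs @ w_i + carry @ w_h)
    assert new_carry.shape[-1] == features
    return new_carry, new_carry
```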

For each cell, the only place where self.features is needed is in the initialize_carry method. But in many models, the initial carry comes from "upstream" in the model, so this method is never used.

Proposal:

  1. Edit ConvLSTMCell to infer features in its __call__ method from its carry input.

  2. Set features=None by default in each cell's constructor.

  3. Add the following line to each initialize_carry method:

assert self.features is not None, "features cannot be None when calling initialize_carry"
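Put together, a cell under this proposal might look roughly like the following (a NumPy toy illustrating the three steps, not the real Flax classes; the recurrence body is a placeholder):

```python
import numpy as np

class SketchCell:
    """Toy cell sketching the proposal; not the actual Flax API."""

    def __init__(self, features=None):
        # Step 2: features is now optional.
        self.features = features

    def initialize_carry(self, batch_dims):
        # Step 3: features is only required here.
        assert self.features is not None, (
            "features cannot be None when calling initialize_carry")
        return np.zeros((*batch_dims, self.features))

    def __call__(self, carry, inputs):
        # Step 1: infer the feature count from the carry,
        # never from self.features.
        features = carry.shape[-1]
        new_carry = np.tanh(carry + inputs.mean(-1, keepdims=True))
        assert new_carry.shape[-1] == features
        return new_carry, new_carry
```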

I can submit a PR for this, if desired.

An alternative would be to pass features directly to the initialize_carry method.
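Under that alternative, the cell would store no feature count at all; something like this hypothetical sketch (again, illustrative names, not the real Flax API):

```python
import numpy as np

class StatelessCarryCell:
    # Hypothetical variant: no features attribute anywhere; the
    # caller passes it to initialize_carry directly.
    def initialize_carry(self, batch_dims, features):
        return np.zeros((*batch_dims, features))
```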

cgarciae (Collaborator) commented Mar 6, 2024

I think the features attribute is needed for RNN; if I remember correctly, we added it for exactly that reason 😅
Also, it feels more natural to specify hyperparameters explicitly in the constructor.

carlosgmartin (Contributor, Author) commented Mar 8, 2024

@cgarciae Isn't shape inference from inputs, as is already done for the inputs argument, more in line with Flax's init philosophy?

The RNN situation could be resolved as follows:

  1. Add a features argument to the cell's initialize_carry method.
  2. Add a features argument to RNN's constructor, and on this line, pass it to the self.cell.initialize_carry call.

That seems more natural and elegant to me, since the number of features may ultimately be determined by stuff upstream in the model (as opposed to being intrinsic to the cell itself).
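The two steps above can be sketched as follows (hypothetical names and a toy cell; the real Flax nn.RNN handles scanning, variable broadcasting, etc. differently):

```python
import numpy as np

class CarryArgCell:
    # Step 1: initialize_carry takes features as an argument.
    def initialize_carry(self, batch_dims, features):
        return np.zeros((*batch_dims, features))

    def __call__(self, carry, inputs):
        new_carry = np.tanh(carry + inputs.mean(-1, keepdims=True))
        return new_carry, new_carry

class SketchRNN:
    # Step 2: features lives on the RNN wrapper, which forwards it
    # to the cell's initialize_carry.
    def __init__(self, cell, features):
        self.cell = cell
        self.features = features

    def __call__(self, inputs):  # inputs: (batch, time, in_features)
        carry = self.cell.initialize_carry(inputs.shape[:1], self.features)
        outputs = []
        for t in range(inputs.shape[1]):
            carry, y = self.cell(carry, inputs[:, t])
            outputs.append(y)
        return np.stack(outputs, axis=1)
```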

@chiamp chiamp added the Priority: P2 - no schedule Best effort response and resolution. We have no plan to work on this at the moment. label Mar 19, 2024
cgarciae (Collaborator) commented
What you are describing is how the Flax recurrent API worked before. However, it was a bit inconsistent: some classes like ConvLSTM required passing the output features while others did not, and it lacked some of the structure needed to implement the RNN class in simple terms. The solution was to add features to all RNN layers and slightly simplify initialize_carry.

carlosgmartin (Contributor, Author) commented
@cgarciae Hmm, is there any reason ConvLSTM can't infer features from its inputs, like the other recurrent modules? I submitted a PR to address that here.
