Generate: deprecate generation relying on default max_length
#18018
Conversation
Thanks for adding this. Made comments on the Flax file which apply to the others. I'll let @patrickvonplaten comment on whether we should remove the `max_length` argument in v5 or not, but `max_new_tokens` makes way more sense to me!
Thanks a lot for opening this very-much needed PR @gante! Before moving forward, I'd like to discuss a bit whether we should remove `max_length`. First, I want to lay out my general opinion regarding the generation parameters in the model config.
Now from this perspective, if somehow possible, I'd like to long-term remove the "fall-back" mechanism to the config generation arguments and delete the generation parameters from the config fully (I'd be happy with an optional generation config instead though). This would help us enormously to:
Second: from that perspective, I think we should not remove `max_length`. So moving forward, I would propose to start with a warning if the user passes neither `max_length` nor `max_new_tokens`. Would love to hear your (general) opinions here!
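A minimal sketch of what that proposed warning path could look like, in plain Python (hypothetical helper and parameter names, not the actual transformers implementation):

```python
import warnings

def resolve_max_length(input_len, max_length=None, max_new_tokens=None,
                       config_max_length=20):
    # Hypothetical sketch of the proposed precedence: prefer max_new_tokens,
    # accept max_length for backward compatibility, and warn when neither is
    # set and generation silently falls back to the config default.
    if max_new_tokens is not None:
        if max_length is not None:
            warnings.warn(
                "Both max_length and max_new_tokens were set; "
                "max_new_tokens takes precedence.",
                UserWarning,
            )
        return input_len + max_new_tokens
    if max_length is not None:
        return max_length
    warnings.warn(
        "Neither max_length nor max_new_tokens was set, falling back to the "
        "default max_length from the model config. This fallback is deprecated.",
        UserWarning,
    )
    return config_max_length
```

The key user-facing difference is that `max_new_tokens` counts only the generated tokens, so the effective length is the prompt length plus the budget.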
I completely agree with the point of moving away from putting things in the config that are not related to instantiating the model (like I've been doing for training parameters like gradient accumulation), and in general we might need some sort of "pipeline config" for the use of the model by default in a pipeline/widget, which could also be re-used for the parameters used by generate. Good for me to have the two arguments forever and ever while encouraging `max_new_tokens`.
I also agree with the reasoning and the suggested path forward 👍
@patrickvonplaten @sgugger I've implemented the changes from the comments above 👍
Made some comments that should be applied on all three frameworks.
LGTM, thanks for iterating on this!
Your comment @patrickvonplaten is exactly what I think! Also, not sure about v5, but passing neither argument could be made meaningful. Another way would be expressing it as a real Python generator, for instance:

```python
for new_token in model.generate(...):
    # do something
```

Again this is pretty theoretical and doesn't map really well to TF or Jax, but I think it's been proposed in other issues, and is something to have in mind (probably to at least address this in docs and explain why you probably don't want to generate eternally :))
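A toy illustration of that idea in pure Python, with a stand-in `next_token` function instead of a real model (generate-as-a-generator is not an actual transformers API, just a sketch of the proposal):

```python
from typing import Callable, Iterator, List, Optional

def stream_generate(next_token: Callable[[List[int]], int],
                    input_ids: List[int],
                    eos_token_id: int,
                    max_new_tokens: Optional[int] = None) -> Iterator[int]:
    # Yields one token at a time. With max_new_tokens=None, the only stopping
    # condition is the EOS token, which is exactly why "generate eternally"
    # becomes a real risk for models that rarely emit EOS.
    ids = list(input_ids)
    produced = 0
    while max_new_tokens is None or produced < max_new_tokens:
        token = next_token(ids)
        yield token
        produced += 1
        if token == eos_token_id:
            break
        ids.append(token)
```

The caller can break out of the loop at any point, which sidesteps the question of a default length entirely, at the cost of pushing the stopping decision onto the user.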
@Narsil that seems like an interesting use case 🤔 What would be the default arguments, if setting neither was a valid option? Perhaps a stopping condition that would target this use case? From what I've seen, the default arguments are one of the biggest pain points for new users (and even some experienced users), so whatever decisions we make in this PR should also cover that!
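One hypothetical shape for such a stopping condition, sketched in plain Python rather than through the library's `StoppingCriteria` interface (transformers does ship a `MaxTimeCriteria` along broadly similar lines):

```python
import time

class MaxWallTimeStop:
    # Hypothetical stopping condition: instead of a token budget, stop
    # generation after a wall-clock budget, which suits open-ended
    # streaming where neither max_length nor max_new_tokens is set.
    def __init__(self, max_seconds: float):
        self.deadline = time.monotonic() + max_seconds

    def __call__(self, generated_ids) -> bool:
        # Return True once generation should stop.
        return time.monotonic() >= self.deadline
```

A generation loop would then call the condition after each new token and exit as soon as it returns `True`, giving a sane default even when no length argument is passed.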
Very nice work @gante ! Thanks a lot for transforming the discussion into an actionable PR! Everything looks good to me. Some nitpicks regarding the wording, but apart from this it looks all good!
What does this PR do?

(EDITED)

This PR applies the outcome of two discussions:

1. Encourage users to set `max_new_tokens`, as opposed to `max_length`, and introduce the `max_new_tokens` argument to TF and FLAX.
2. Update the docstring of `scores`, to be clear that its length depends on `max_new_tokens` and other factors.

Review suggestion: review PT first. TF and Flax are mostly copy/paste.
Closes #17868