PEGASUS-X #18551
Conversation
The documentation is not available anymore as the PR was closed or merged.
On it 🤗
Awesome work! Really enjoyed reviewing; the attention implementation is super clean IMO! Thanks a lot for this model addition! A few nits here and there, but it's great!
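As a purely illustrative sketch (not the model's actual code, and the helper name is hypothetical): block-local attention of the kind PEGASUS-X uses needs the sequence length to be a multiple of the block size, so inputs are padded up before being folded into blocks. The padding arithmetic looks roughly like:

```python
# Hypothetical helper, for illustration only: round a sequence length up to
# the nearest multiple of the attention block size, as block-local attention
# requires before reshaping the sequence into (num_blocks, block_size).
def pad_to_multiple(seq_len: int, block_size: int) -> int:
    """Smallest multiple of block_size that is >= seq_len."""
    return ((seq_len + block_size - 1) // block_size) * block_size

# e.g. a 1000-token input with block size 512 is padded to 1024 tokens
print(pad_to_multiple(1000, 512))  # 1024
print(pad_to_multiple(1024, 512))  # 1024 (already aligned, no extra padding)
```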
```python
def test_seq_to_seq_generation(self):
    hf = PegasusXForConditionalGeneration.from_pretrained("zphang/pegasus-x-base").to(torch_device)
    tok = PegasusTokenizer.from_pretrained("google/pegasus-base")
```
Would be awesome to add the tokenizer files to the pegasus-x-base repos to use the same model ID 😉
Shall we do that after merging? My plan is: merge implementation -> move weights to google/* -> update model hub paths.
Might be simpler to do that in this PR, but no strong opinion as long as the follow-up PR is opened!
Hi, the weights are intended for the google/ org ultimately; if you can add me there, that'd be great! Otherwise, I was going to have someone in the org download and re-upload them.
Hey @zphang,
Thanks a lot for this great model addition! Super cool to see that PEGASUS-X outperforms LongT5 for base and large!
The PR looks to be in great shape already; I think there are only some final clean-ups to do. If possible, it would be great if:
- we could potentially remove `DimensionInfo`, as it adds a lot of "look-up time" when reading the code
- you could take another look at the slow generation test, as I think some expected values are missing
Apart from this I mostly left nits. Very excited about this model, let me know if you need any help!
```python
logger = logging.get_logger(__name__)

PEGASUS_X_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    "zphang/pegasus-x-base": "https://huggingface.co/zphang/pegasus-x-base/resolve/main/config.json",
}
```
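The archive map is just a model-ID-to-URL dictionary. As a hedged sketch (the URL pattern is taken from the entry above; the `config_url` helper is ours, not the library's), it could equivalently be generated like:

```python
# Sketch only: derive the hosted config.json URL from a Hub model ID using
# the "resolve/main" pattern visible in the archive map above.
def config_url(model_id: str) -> str:
    return f"https://huggingface.co/{model_id}/resolve/main/config.json"

PEGASUS_X_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    mid: config_url(mid) for mid in ["zphang/pegasus-x-base"]
}
print(PEGASUS_X_PRETRAINED_CONFIG_ARCHIVE_MAP["zphang/pegasus-x-base"])
# https://huggingface.co/zphang/pegasus-x-base/resolve/main/config.json
```

Keeping the map as a literal dictionary (as the PR does) is the convention in the codebase; the helper is only meant to show the structure of the entries.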
I'd eventually change the weights organization to https://huggingface.co/nyu-mll to make it a bit more official maybe? @zphang can I add you to the nyu-mll org?
```python
)

EXPECTED = [
    "we investigate the performance of a new pretrained model for long input summarization. <n> the model"
]
```
Shouldn't there be more expected outputs? I don't really see how this test can pass at the moment.
It's a single long string (one example).
Ah I see, good for me then! Guess it would have been nice to rename `batch_input` to just `input` since it's only one example, but not a big deal!
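To make the point concrete, here's a minimal sketch (function and variable names are hypothetical, not from the PR) of the comparison pattern under discussion: generated IDs get batch-decoded into a list of strings and compared against `EXPECTED`, which is a one-element list because the batch holds a single long example:

```python
# Illustrative only: a slow generation test compares batch-decoded strings
# against a list of expected outputs; with one input example, both lists
# have length 1, so a single long expected string is enough.
EXPECTED = [
    "we investigate the performance of a new pretrained model for long input "
    "summarization. <n> the model"
]

def outputs_match(decoded: list) -> bool:
    return decoded == EXPECTED

# A decoded batch containing the one matching summary passes:
print(outputs_match(list(EXPECTED)))  # True
```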
I can follow up on the rest of the feedback this weekend / early next week: most of it looks manageable. One comment on
Thanks for the quick comment! For me it's mostly the single uppercase letters that I would like to change. It's ok for me to keep the class, even though we haven't done it before for models like LongT5, Longformer or BigBird. Overall I'd prefer to not have the class at all, but I'm ok to leave it if you feel strongly about it @zphang :-)
Let me know if there is anything else I need to address!
Good for me!
The last thing for me would be to maybe move the checkpoints (https://huggingface.co/zphang/pegasus-x-base-arxiv) under the NYU org: https://huggingface.co/nyu-mll
@zphang I think you're a member of the org, so it should be as easy as a renaming operation under the settings of the model repo. Would that be ok for you? Models usually get more traction and visibility, and last longer in terms of maintenance, in orgs.
Also cc @LysandreJik
Let me ping Peter Liu on this. He should be able to pull and push to the Google org. I will update the paths in the PR when it is ready.
Thanks for making the change! Test failures seem unrelated :-) Merging!
Hi @zphang Thank you for adding this model! We have a few failing tests for this model, which can be found on this CI job run page. You can click [View raw logs] on the icon at the top-right corner.
Thank you in advance!
Here is the PR to correct the naming: https://github.com/huggingface/transformers/pull/18896/files
Fix in #19025
* PegasusX Initial commit
* rename
* pegasus X implementation
* pegx update
* pegx fix
* pegasus-x fixes
* pegx updates
* cleanup
* cleanup
* cleanup
* tests
* stylefixes
* Documentation update
* Model hub fix
* cleanup
* update
* update
* testfix
* Check fix
* tweaks for merging
* style
* style
* updates for pr
* style
* change pegasus-x repo
Thanks for sharing this model @zphang!
The FLAX weights of the fine-tuned models can be found here, and the FLAX-to-HF conversion script can be found here. I'll try to convert the models over and upload them to the HF hub this week.
What does this PR do?
Adds PEGASUS-X implementation.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@patrickvonplaten, @patil-suraj
Note: The models are currently hosted on https://huggingface.co/zphang but should be transferred to the Google organization shortly.