
PEGASUS-X #18551

Merged: 25 commits into huggingface:main, Sep 2, 2022

Conversation

@zphang (Contributor) commented Aug 10, 2022

What does this PR do?

Adds PEGASUS-X implementation.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten, @patil-suraj


Note: The models are currently hosted on https://huggingface.co/zphang but should be transferred to the Google organization shortly.

@HuggingFaceDocBuilderDev commented Aug 10, 2022

The documentation is not available anymore as the PR was closed or merged.

@LysandreJik (Member)

cc @ArthurZucker

@ArthurZucker (Collaborator)

On it 🤗

@ArthurZucker (Collaborator) left a comment

Awesome work! I really enjoyed reviewing; the attention implementation is super clean IMO! Thanks a lot for this model addition! A few nits here and there, but it's great!

docs/source/en/index.mdx (review thread resolved)

def test_seq_to_seq_generation(self):
    hf = PegasusXForConditionalGeneration.from_pretrained("zphang/pegasus-x-base").to(torch_device)
    tok = PegasusTokenizer.from_pretrained("google/pegasus-base")
Collaborator:

Would be awesome to add the tokenizer files to the pegasus-x-base repos to use the same model ID 😉
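
For illustration, a minimal sketch of what loading with a single model ID would look like, assuming the tokenizer files were added to the same repo (that repo content is an assumption at this point):

from transformers import PegasusTokenizer, PegasusXForConditionalGeneration

# Assumes the tokenizer files have been uploaded alongside the model weights,
# so a single identifier serves both the model and the tokenizer.
model_id = "zphang/pegasus-x-base"
model = PegasusXForConditionalGeneration.from_pretrained(model_id)
tok = PegasusTokenizer.from_pretrained(model_id)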

Contributor (Author):

Shall we do that after merging? My plan is merge implementation -> move weights to google/* -> update model hub paths.

Collaborator:

Might be simpler to do that in this PR, but no strong opinion as long as the follow-up PR is opened!

Contributor:

BTW, we could also move them to NYU if you want, @zphang. Google works as well, though. I can add you to the Google org (or the NYU org) and you could upload the weights there?

Contributor (Author):

Hi, the weights are ultimately intended for the google/ org; if you can add me there, that'd be great! Otherwise, I was going to have someone in the org download and re-upload them.

tests/models/pegasus_x/test_modeling_pegasus_x.py (two outdated review threads, resolved)
@patrickvonplaten (Contributor) left a comment

Hey @zphang,

Thanks a lot for this great model addition! Super cool to see that PEGASUS-X outperforms LongT5 for base and large!

The PR looks to be in great shape already - I think there are only some final clean-ups to do. If possible, it would be great if:

  • we could potentially remove DimensionInfo - it does add a lot of "look-up time" when reading the code
  • we could take another look at the slow generation test, as I think some expected values are missing

Apart from this I mostly left nits. Very excited about this model, let me know if you need any help!

src/transformers/models/pegasus_x/__init__.py (outdated review thread, resolved)
logger = logging.get_logger(__name__)

PEGASUS_X_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    "zphang/pegasus-x-base": "https://huggingface.co/zphang/pegasus-x-base/resolve/main/config.json",
}
Contributor:

I'd eventually change the weights organization to https://huggingface.co/nyu-mll to make it a bit more official maybe? @zphang can I add you to the nyu-mll org?

src/transformers/models/pegasus_x/modeling_pegasus_x.py (three outdated review threads, resolved)
)

EXPECTED = [
    "we investigate the performance of a new pretrained model for long input summarization. <n> the model"
Contributor:

Shouldn't there be more expected outputs? I don't really see how this test can pass at the moment.

Contributor (Author):

It's a single long string (one example).

Contributor:

Ah, I see - good for me then! It would have been nice to rename batch_input to just input since it's just one example, but not a big deal!

@zphang (Contributor, Author) commented Aug 18, 2022

I can follow up on the rest of the feedback this weekend / early next week: most of it looks manageable.

One comment on DimensionInfo: I use it to capture all of the shape-related attributes I need for the various reshapes/padding: it felt cleaner/more manageable for me to keep it all in one data structure than to pass them around individually. I can expand the attributes to the full readable names as you mention above, but I think it's useful to keep the dataclass. Let me know what you think: I'm fine either way.
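
For reference, a minimal sketch of the kind of dataclass described above; the field names here are illustrative stand-ins, not the actual DimensionInfo attributes from this PR:

from dataclasses import dataclass

# Illustrative only: one data structure bundling the shape-related values
# needed by the block-local attention reshapes/padding. Field names are
# hypothetical stand-ins for the PR's actual attributes.
@dataclass
class DimensionInfo:
    batch_size: int
    seq_length: int
    block_size: int
    num_heads: int
    hidden_dim: int
    padded_seq_length: int  # seq_length rounded up to a multiple of block_size
    num_blocks: int         # padded_seq_length // block_size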

@patrickvonplaten (Contributor)

> I can follow up on the rest of the feedback this weekend / early next week [...] I think it's useful to keep the dataclass. Let me know what you think: I'm fine either way.

Thanks for the quick comment! For me it's mostly the single uppercase letters that I would like to change. It's OK for me to keep the class, even if we haven't done it before for models like LongT5, Longformer or BigBird. I think overall I'd prefer to not have the class at all, but it's OK for me to leave it if you feel strongly about it @zphang :-)
It'd just be super nice to write out the single upper-case letters.

@zphang (Contributor, Author) commented Aug 30, 2022

Let me know if there is anything else I need to address!

@patrickvonplaten (Contributor) left a comment

Good for me!
The last thing for me would be to maybe move the checkpoints: https://huggingface.co/zphang/pegasus-x-base-arxiv under the NYU org:
https://huggingface.co/nyu-mll

@zphang I think you're a member of the org, so it should be as easy as a renaming operation under the settings of the model repo. Would that be OK for you? Models usually get more traction and visibility in orgs, and last longer in terms of maintenance.

Also cc @LysandreJik

@zphang (Contributor, Author) commented Aug 31, 2022

Let me ping Peter Liu on this. He should be able to pull and push to the Google org. I will update the paths in the PR when it is ready.

@patrickvonplaten (Contributor)

Thanks for making the change! Test failures seem unrelated :-) Merging!

@patrickvonplaten merged commit 53e33e6 into huggingface:main on Sep 2, 2022
@ydshieh (Collaborator) commented Sep 5, 2022

Hi @zphang, thank you for adding this model! We have a few failing tests for it, which can be found on this CI job run page (click [View raw logs] via the icon at the top-right corner).

  • One issue is the missing checkpoint pegasus-x-base:

    pegasus-x-base is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'

    Do you know where the correct checkpoint is? (See the sketch after this list.)

  • Another test failure is test_seq_to_seq_generation, where the model outputs

    PEGASUSX PEGASUS PEGASUS PEGASUS-X PEGASUS PEGASUS-X PEGASUS PEGASUS-X PEGASUS PEGASUS PEGASUS
    

Could you check whether you get the expected values (we investigate the performance ...) on your side, and/or (if possible) why this nonsense output occurs?

  • For the remaining failure test_torchscript_output_attentions, we will fix it on our side.

Thank you in advance!
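
On the first point, a minimal sketch of the identifier issue (the google/ repo name below is an assumption, pending the weight transfer):

from transformers import PegasusXForConditionalGeneration

# Fails: "pegasus-x-base" is neither a local folder nor a full "org/name" hub ID.
# model = PegasusXForConditionalGeneration.from_pretrained("pegasus-x-base")

# Works once the weights live under a full identifier (the org is assumed here):
model = PegasusXForConditionalGeneration.from_pretrained("google/pegasus-x-base")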

@patrickvonplaten (Contributor)

Here the PR to correct the naming: https://github.com/huggingface/transformers/pull/18896/files

@ydshieh (Collaborator) commented Sep 14, 2022

Fix in #19025

oneraghavan pushed a commit to oneraghavan/transformers that referenced this pull request Sep 26, 2022
* PegasusX Initial commit
* rename
* pegasus X implementation
* pegx update
* pegx fix
* pegasus-x fixes
* pegx updates
* cleanup
* cleanup
* cleanup
* tests
* stylefixes
* Documentation update
* Model hub fix
* cleanup
* update
* update
* testfix
* Check fix
* tweaks for merging
* style
* style
* updates for pr
* style
* change pegasus-x repo

@thefonseca

Thanks for sharing this model @zphang!
Do you intend to release the fine-tuned checkpoints (pubmed-large, arxiv-large, govreport-large, etc.)?

@zphang (Contributor, Author) commented Oct 31, 2022

The Flax weights of the fine-tuned models can be found here:
https://github.com/google-research/pegasus/tree/main/pegasus/flax#checkpoints

And the Flax-to-HF conversion script can be found here:
https://github.com/google-research/pegasus/blob/main/pegasus/flax/checkpoint_conversion/convert_from_flax_to_hf.py

I'll try to convert the models over and upload them to the HF hub this week.
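
Once uploaded, loading one of the converted fine-tuned checkpoints would presumably look like this (the repo name below is hypothetical until the upload happens):

from transformers import AutoTokenizer, PegasusXForConditionalGeneration

# Hypothetical repo name; the fine-tuned checkpoints were not on the hub yet
# at the time of this comment.
checkpoint = "google/pegasus-x-large-arxiv"
model = PegasusXForConditionalGeneration.from_pretrained(checkpoint)
tok = AutoTokenizer.from_pretrained(checkpoint)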
