
Add X-CLIP #18852

Merged
merged 40 commits into huggingface:main on Sep 8, 2022

Conversation

@NielsRogge (Contributor) commented on Sep 1, 2022:

What does this PR do?

This PR adds X-CLIP, which is a minimal extension of CLIP for video-language pre-training.

To do:

  • upload all checkpoints to the hub, as part of the microsoft organization

@NielsRogge (Contributor, Author) commented:

Many tests fail due to the following error:

ModuleNotFoundError: No module named 'transformers.models.xclip'

This is probably because I first called the model folder "xclip", which is now called "x_clip". Still, I'm wondering why it keeps looking for the module models.xclip. If anyone has any pointers, that would be greatly appreciated.

@sgugger (Collaborator) left a review:

Thanks for adding this new model! Left a couple of comments.

For the import errors, I think you need to add "xclip" to the SPECIAL_MODEL_TYPE_TO_MODULE_NAME variable in configuration_auto.py, since the module name (x_clip) does not match the model type (xclip, with any - replaced by _).
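Concretely, the mapping sgugger describes works roughly as follows. This is a simplified sketch: the dict and helper names mirror those in transformers' configuration_auto.py, but the bodies here are illustrative, not the library's exact code.

```python
# Sketch of auto-class module resolution (simplified, illustrative).
# Model folders under transformers/models are normally derived from the
# model type; X-CLIP is special because its type ("xclip") and its
# folder ("x_clip") differ, so it needs an explicit entry.
SPECIAL_MODEL_TYPE_TO_MODULE_NAME = {
    "xclip": "x_clip",  # model type "xclip" lives in models/x_clip, not models/xclip
}

def model_type_to_module_name(key: str) -> str:
    """Map a model type to its module name under transformers.models."""
    if key in SPECIAL_MODEL_TYPE_TO_MODULE_NAME:
        return SPECIAL_MODEL_TYPE_TO_MODULE_NAME[key]
    # Default rule: hyphens in the model type become underscores.
    return key.replace("-", "_")

print(model_type_to_module_name("xclip"))    # x_clip
print(model_type_to_module_name("gpt-neo"))  # gpt_neo
```

Without the special entry, the default rule would send lookups for "xclip" to models.xclip, which matches the ModuleNotFoundError quoted above.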

Files with review comments (now outdated/resolved):

  • src/transformers/__init__.py
  • src/transformers/models/auto/tokenization_auto.py
  • src/transformers/models/x_clip/modeling_x_clip.py
  • src/transformers/models/x_clip/test.py
  • tests/models/x_clip/test_modeling_x_clip.py
  • utils/check_config_docstrings.py
  • utils/check_repo.py
@HuggingFaceDocBuilderDev commented on Sep 2, 2022:

The documentation is not available anymore as the PR was closed or merged.

@NielsRogge (Contributor, Author) commented:

@sgugger thanks a lot, that solved the issue. There seems to be another (small) issue with run_tests_hub:

==================================== ERRORS ====================================
_______________ ERROR collecting tests/utils/test_file_utils.py ________________
tests/utils/test_file_utils.py:26: in <module>
    from transformers import *  # noqa F406
src/transformers/utils/import_utils.py:1021: in __getattr__
    value = getattr(module, name)
src/transformers/utils/import_utils.py:1023: in __getattr__
    raise AttributeError(f"module {self.__name__} has no attribute {name}")
E   AttributeError: module transformers.models.clip has no attribute CLIPProcessor

Running RUN_SLOW=yes pytest tests/utils/test_file_utils.py passes locally for me.
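For context, the __getattr__ frames in that traceback come from transformers' lazy-import machinery. A minimal, hypothetical simplification of the _LazyModule pattern (not the library's exact code): every public name is resolved on first attribute access from an import-structure map, so a name listed in the main init but missing from the submodule's map raises exactly this AttributeError as soon as "from transformers import *" touches it.

```python
import importlib
from types import ModuleType

class _LazyModule(ModuleType):
    """Sketch of a lazily-populated module: exported names are resolved on
    first attribute access from an import-structure map (simplified)."""

    def __init__(self, name, import_structure):
        super().__init__(name)
        # Map each exported class name to the submodule that defines it.
        self._class_to_module = {
            cls: submodule
            for submodule, classes in import_structure.items()
            for cls in classes
        }

    def __getattr__(self, name):
        if name not in self._class_to_module:
            # The run_tests_hub failure above is this branch firing.
            raise AttributeError(f"module {self.__name__} has no attribute {name}")
        submodule = importlib.import_module(self._class_to_module[name], self.__name__)
        return getattr(submodule, name)

# A name absent from the map fails on access, not at import time:
fake = _LazyModule("fake_transformers", {".processing_clip": ["CLIPProcessor"]})
try:
    fake.CLIPFeatureExtractor
except AttributeError as e:
    print(e)  # module fake_transformers has no attribute CLIPFeatureExtractor
```

This is why the error only surfaces under a star import: it forces every advertised attribute to be resolved at once.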

@sgugger (Collaborator) commented on Sep 2, 2022:

That would be because you moved CLIPProcessor into the non-vision-dependent objects in the main init (and rightly so), but did not do the same in models/clip/__init__.py.

@alaradirik (Contributor) left a review:

Thank you for adding this! Looks good to me overall, I just left a few comments and questions.

Files with review comments:

  • tests/models/x_clip/test_modeling_x_clip.py
  • src/transformers/models/x_clip/modeling_x_clip.py

hidden_states = torch.cat([hidden_states, msg_token], dim=1)

residual = hidden_states
A reviewer (Contributor) commented on this diff:
Just double checking, shouldn't this be residual = hidden_states.clone() instead?

It seems lines 449-462 would alter residual too.

@NielsRogge (Contributor, Author) replied:

Hmm, this seems to work fine; is it possible that residual just keeps referring to the original hidden states?

Just did a quick experiment:

>>> a = "hello"
>>> b = a
>>> a += "niels"
>>> b
'hello'
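The string experiment shows the key point, but the distinction worth naming is rebinding versus in-place mutation: residual stays intact because torch.cat returns a new tensor and the later code rebinds the hidden_states name, not because tensors are copied. A plain-Python sketch with lists (no torch required; the torch analogies in the comments are assumptions about the model code, not quotes from it):

```python
# Rebinding: the name is pointed at a NEW object; other names keep the old one.
hidden_states = [1, 2, 3]
residual = hidden_states
hidden_states = hidden_states + [4]  # like hidden_states = torch.cat(...)
print(residual)  # [1, 2, 3] -- residual still refers to the original object

# In-place mutation: every name bound to the object sees the change.
hidden_states = [1, 2, 3]
residual = hidden_states
hidden_states += [4]  # like an in-place tensor op such as add_()
print(residual)  # [1, 2, 3, 4] -- residual changed too; this is when .clone() matters
```

So .clone() would only be needed if the subsequent lines mutated the tensor in place rather than producing new tensors.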

Further review comments on src/transformers/models/x_clip/modeling_x_clip.py (outdated/resolved).
@NielsRogge (Contributor, Author) commented:

@sgugger and @alaradirik - the PR is ready for merge. Kindly asking for your approval :)

@sgugger (Collaborator) left a review:

Looking good, thanks again for adding this model!

Review comment on src/transformers/models/auto/tokenization_auto.py (outdated/resolved).
@NielsRogge merged commit bb6f6d5 into huggingface:main on Sep 8, 2022.
@alaradirik (Contributor) left a review:

Looks good to me! Thanks for adding this.

oneraghavan pushed a commit to oneraghavan/transformers that referenced this pull request on Sep 26, 2022:
* First draft

* Improve conversion script

* Make vision encoder work

* More improvements

* Improve conversion script

* Fix quality

* Add MultiframeIntegrationTransformer

* More improvements

* Make MiT output work

* Fix quality

* Add prompts generator

* Add tests

* Fix some tests

* Fix some more tests

* Fix more tests

* Improve conversion script

* Fix model outputs

* Fix more tests

* Add XClipProcessor

* Use processor in conversion script

* Fix integration test

* Update README, fix docs

* Fix all tests

* Add MIT output to XClipOutput

* Create better variable names

* Rename XClip to XCLIP

* Extend conversion script

* Add support for large models

* Add support for 16 frame models

* Add another model

* Fix module issue

* Apply suggestions from code review

* Add figure to docs

* Fix CLIPProcessor issue

* Apply suggestions from code review

* Delete file

* Convert more checkpoints

* Convert last checkpoint

* Update nielsr to microsoft
4 participants