Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add YOLOS #16848

Merged
merged 30 commits into from May 2, 2022
Merged

Add YOLOS #16848

merged 30 commits into from May 2, 2022

Conversation

NielsRogge
Copy link
Contributor

@NielsRogge NielsRogge commented Apr 20, 2022

What does this PR do?

This PR adds YOLOS, an awesome and simple object detector.

YOLOS is just a single Transformer encoder (ViT), trained using DETR's objective.

For now, I've used "vit" as base_model_prefix, in order to easily load weights from ViT and ViTMAE checkpoints on the hub.

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this new model! There are a few badly named variables left and some docstrings to fix, but overall it's in great shape!

docs/source/en/model_doc/yolos.mdx Outdated Show resolved Hide resolved
src/transformers/models/yolos/configuration_yolos.py Outdated Show resolved Hide resolved
src/transformers/models/yolos/modeling_yolos.py Outdated Show resolved Hide resolved
src/transformers/models/yolos/modeling_yolos.py Outdated Show resolved Hide resolved
src/transformers/models/yolos/modeling_yolos.py Outdated Show resolved Hide resolved
src/transformers/models/yolos/modeling_yolos.py Outdated Show resolved Hide resolved
src/transformers/models/yolos/modeling_yolos.py Outdated Show resolved Hide resolved
tests/yolos/test_modeling_yolos.py Outdated Show resolved Hide resolved
tests/yolos/test_modeling_yolos.py Outdated Show resolved Hide resolved
@NielsRogge
Copy link
Contributor Author

NielsRogge commented Apr 26, 2022

Addressed most comments. The remaining comments are about badly formatted docstrings, however these are all copied from DETR (so I can't change them due to #Copied from statements). Is it ok if I address these docstrings in a separate PR for both models?

Also pinging @Narsil as the pipeline test for YOLOS is failing. This is because YOLOS doesn't take pixel_mask as input, whereas DETR does. This makes YOLOS fail for the object detection pipeline.

@sgugger
Copy link
Collaborator

sgugger commented Apr 26, 2022

I'd advocate to make the changes in docstrings in DETR to be propagated to YOLOS in this PR, just to make sure we don't forget.

@Narsil
Copy link
Contributor

Narsil commented Apr 26, 2022

Also pinging @Narsil as the pipeline test for YOLOS is failing. This is because YOLOS doesn't take pixel_mask as input, whereas DETR does. This makes YOLOS fail for the object detection pipeline.

Then the feature_extractor should not output them. The image pipeline are pretty simple and roughly simply do

model(**feature_extractor(image)) so if the feature extractor only outputs what's needed then it should work.
That or pixel_mask should be handled (doesn't seem to be making sense for this model reading your comment).

@NielsRogge
Copy link
Contributor Author

NielsRogge commented Apr 26, 2022

Then the feature_extractor should not output them.

Yeah the problem is, YOLOS uses the same feature extractor as DETR, which outputs both pixel_values and pixel_mask. Hence, I've just added ("yolos", "DetrFeatureExtractor") to the Auto Feature Extractor API.

I think the easiest here is to add pixel_mask=None to the forward of YOLOS, in order to make model(**feature_extractor(image)) work.

@sgugger
Copy link
Collaborator

sgugger commented Apr 26, 2022

I think the easiest here is to add pixel_mask=None to the forward of YOLOS, in order to make model(**feature_extractor(image)) work.

We're not adding an argument that will be ignored all the time, that's just confusing to users. Especially if they end up passing one and don't get why it's not used.

If the feature extractor should not return pixel_mask then either use a class attribute on DetrFeatureExtractor to make it not return that in certain cases, or create a YolosFeatureExtractor that removes that field from the output of the feature extractor.

@NielsRogge
Copy link
Contributor Author

NielsRogge commented Apr 26, 2022

Ok so I created a new YolosFeatureExtractor, however the pipeline test is still failing:

def _call_impl(self, *input, **kwargs):
        forward_call = (self._slow_forward if torch._C._get_tracing_state() else self.forward)
        # If we don't have any hooks, we want to skip the rest of the logic in
        # this function, and just call forward.
        if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
                or _global_forward_hooks or _global_forward_pre_hooks):
>           return forward_call(*input, **kwargs)
E           TypeError: forward() got an unexpected keyword argument 'pixel_mask'

@Narsil could you help me debug this? It's weird cause YolosFeatureExtractor doesn't create a pixel mask. Also, I added doc tests which are passing.

@Narsil
Copy link
Contributor

Narsil commented Apr 28, 2022

@Narsil could you help me debug this? It's weird cause YolosFeatureExtractor doesn't create a pixel mask. Also, I added doc tests which are passing.

I have checked and the reason is that the tested Feature extractor is actually a detr one, not a Yolo one:

https://github.com/huggingface/transformers/pull/16848/files#diff-fcbe32a3a065f97b00f1c242ecd45858b8d5680a2437b65af183eb0c439e2be9R347

The pipeline tests rely on ModelTester to create the base objects

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Apr 28, 2022

The documentation is not available anymore as the PR was closed or merged.

README.md Outdated Show resolved Hide resolved
@NielsRogge
Copy link
Contributor Author

Failing test is unrelated, merging.

@NielsRogge NielsRogge merged commit 1ac6987 into huggingface:main May 2, 2022
nandwalritik pushed a commit to nandwalritik/transformers that referenced this pull request May 3, 2022
* First draft

* Add YolosForObjectDetection

* Make forward pass work

* Add mid position embeddings

* Add interpolation of position encodings

* Add expected values

* Add YOLOS to tests

* Add integration test

* Support tiny model as well

* Support all models in conversion script

* Remove mid_pe_size attribute

* Make more tests pass

* Add model to README and fix config

* Add copied from statements

* Rename base_model_prefix to vit

* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP

* Apply suggestions from code review

* Apply more suggestions from code review

* Convert remaining checkpoints

* Improve docstrings

* Add YolosFeatureExtractor

* Add feature extractor to docs

* Add corresponding tests

* Fix style

* Fix docs

* Apply suggestion from code review

* Fix bad rebase

* Fix some more bad rebase

* Fix missing character

* Improve docs and variable names

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
stevhliu pushed a commit to stevhliu/transformers that referenced this pull request May 3, 2022
* First draft

* Add YolosForObjectDetection

* Make forward pass work

* Add mid position embeddings

* Add interpolation of position encodings

* Add expected values

* Add YOLOS to tests

* Add integration test

* Support tiny model as well

* Support all models in conversion script

* Remove mid_pe_size attribute

* Make more tests pass

* Add model to README and fix config

* Add copied from statements

* Rename base_model_prefix to vit

* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP

* Apply suggestions from code review

* Apply more suggestions from code review

* Convert remaining checkpoints

* Improve docstrings

* Add YolosFeatureExtractor

* Add feature extractor to docs

* Add corresponding tests

* Fix style

* Fix docs

* Apply suggestion from code review

* Fix bad rebase

* Fix some more bad rebase

* Fix missing character

* Improve docs and variable names

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
nandwalritik pushed a commit to nandwalritik/transformers that referenced this pull request May 4, 2022
* First draft

* Add YolosForObjectDetection

* Make forward pass work

* Add mid position embeddings

* Add interpolation of position encodings

* Add expected values

* Add YOLOS to tests

* Add integration test

* Support tiny model as well

* Support all models in conversion script

* Remove mid_pe_size attribute

* Make more tests pass

* Add model to README and fix config

* Add copied from statements

* Rename base_model_prefix to vit

* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP

* Apply suggestions from code review

* Apply more suggestions from code review

* Convert remaining checkpoints

* Improve docstrings

* Add YolosFeatureExtractor

* Add feature extractor to docs

* Add corresponding tests

* Fix style

* Fix docs

* Apply suggestion from code review

* Fix bad rebase

* Fix some more bad rebase

* Fix missing character

* Improve docs and variable names

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
nandwalritik pushed a commit to nandwalritik/transformers that referenced this pull request May 4, 2022
* First draft

* Add YolosForObjectDetection

* Make forward pass work

* Add mid position embeddings

* Add interpolation of position encodings

* Add expected values

* Add YOLOS to tests

* Add integration test

* Support tiny model as well

* Support all models in conversion script

* Remove mid_pe_size attribute

* Make more tests pass

* Add model to README and fix config

* Add copied from statements

* Rename base_model_prefix to vit

* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP

* Apply suggestions from code review

* Apply more suggestions from code review

* Convert remaining checkpoints

* Improve docstrings

* Add YolosFeatureExtractor

* Add feature extractor to docs

* Add corresponding tests

* Fix style

* Fix docs

* Apply suggestion from code review

* Fix bad rebase

* Fix some more bad rebase

* Fix missing character

* Improve docs and variable names

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
nandwalritik pushed a commit to nandwalritik/transformers that referenced this pull request May 4, 2022
* First draft

* Add YolosForObjectDetection

* Make forward pass work

* Add mid position embeddings

* Add interpolation of position encodings

* Add expected values

* Add YOLOS to tests

* Add integration test

* Support tiny model as well

* Support all models in conversion script

* Remove mid_pe_size attribute

* Make more tests pass

* Add model to README and fix config

* Add copied from statements

* Rename base_model_prefix to vit

* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP

* Apply suggestions from code review

* Apply more suggestions from code review

* Convert remaining checkpoints

* Improve docstrings

* Add YolosFeatureExtractor

* Add feature extractor to docs

* Add corresponding tests

* Fix style

* Fix docs

* Apply suggestion from code review

* Fix bad rebase

* Fix some more bad rebase

* Fix missing character

* Improve docs and variable names

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* First draft

* Add YolosForObjectDetection

* Make forward pass work

* Add mid position embeddings

* Add interpolation of position encodings

* Add expected values

* Add YOLOS to tests

* Add integration test

* Support tiny model as well

* Support all models in conversion script

* Remove mid_pe_size attribute

* Make more tests pass

* Add model to README and fix config

* Add copied from statements

* Rename base_model_prefix to vit

* Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP

* Apply suggestions from code review

* Apply more suggestions from code review

* Convert remaining checkpoints

* Improve docstrings

* Add YolosFeatureExtractor

* Add feature extractor to docs

* Add corresponding tests

* Fix style

* Fix docs

* Apply suggestion from code review

* Fix bad rebase

* Fix some more bad rebase

* Fix missing character

* Improve docs and variable names

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants