Add Deformable DETR #17281
Conversation
It looks like the PR wasn't in a state ready for review. Please make sure all docstrings are finished and the code is generally cleaned up before asking reviewers to look.
Force-pushed from b26d1f6 to 74ec0b5.
Addressed most comments. I would like to have:
6e224b8
to
7f5fb0a
Compare
@Narsil there's an issue with the pipeline tests. Also, CircleCI reports the following:
I might need some help with this.
The generic tests will always run the model on CPU, so the best way is to discard this model from the test. I would also add a slow GPU test that tries to use the pipeline directly, if that's OK for the CI.
Does that make sense? If it's hard to have a GPU test (not sure we ever call those anyway for pipelines, @LysandreJik?) then we can do without, but even if it's not auto-tested there's value in creating the test IMO (it will run on local machines that try to run the test).
As for the missing file, I don't really have good pointers for that since you seem to have added the correct line.
Force-pushed from 8f1bfbe to ec61d72.
OK, so looking at why the custom kernel fails to build:
This occurs quite often. The build is missing ninja. Try adding `pip install ninja` to the CI jobs.
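As a quick aside, PyTorch's `torch.utils.cpp_extension` prefers the ninja build system when it is on `PATH`, which is why the build fails without it. A minimal, hypothetical helper (the `ninja_available` name is mine, not from this PR) to check for it before attempting the build:

```python
import shutil

def ninja_available() -> bool:
    # torch.utils.cpp_extension builds custom C++/CUDA ops with ninja when it
    # is found on PATH; without it, compiling custom kernels often fails in CI.
    return shutil.which("ninja") is not None
```

In CI, running `pip install ninja` before the build step is usually enough to make the check pass.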
Additionally, if we start having custom CUDA kernels that are enabled by default, we must include ninja in our dependencies.
So installing ninja did the trick of overcoming the initial hurdle. As commented above, if we make this work it should go into our dependencies. Now it's failing:
because CircleCI is CPU-only and doesn't have CUDA. Basically, your custom CUDA kernel requires a CUDA toolchain to build. @ydshieh, do you by chance know if we are planning to get GPU runners on CircleCI? @NielsRogge, does this model work on CPU at all, i.e. is there a fallback to a non-custom kernel in the absence of GPUs? If there is, the code should be modified to check whether a CUDA environment is available and, if not, skip loading the custom kernel, and everything will just work.
The model only runs on GPU and requires the custom kernel. The authors do provide a CPU version here, but it's for "debugging and testing purposes only".
The current CircleCI jobs use the docker image.
If it is not too much work to make running on both CPU and GPU work (considering the authors provide some implementation), I would advocate doing it, also mainly for "debugging and testing purposes only".
Hmm, I looked into the code; the problem is that their CPU version doesn't accept 2 of the arguments.
But we need:
a. the modeling files must not fail on import;
b. the tests for such a model should all be decorated with `require_torch_gpu`;
c. the testing will have to happen on our CI that has GPUs, which means no "real-time" testing.
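A sketch of point (b), assuming the standard `unittest` skip mechanism; `require_cuda` here is a simplified stand-in for the real `require_torch_gpu` decorator from `transformers.testing_utils` (which checks `torch.cuda.is_available()` itself):

```python
import unittest

def require_cuda(cuda_available):
    # Simplified stand-in for transformers.testing_utils.require_torch_gpu:
    # skip the decorated test (or test class) unless CUDA is available.
    return unittest.skipUnless(cuda_available, "test requires CUDA")

@require_cuda(cuda_available=False)  # pretend we are on the CPU-only CircleCI image
class DeformableDetrModelTest(unittest.TestCase):
    def test_forward(self):
        raise RuntimeError("needs a GPU")  # never reached: the class is skipped
```

Run under the GPU CI the condition is true and the tests execute normally; on CPU-only machines they are reported as skipped rather than failing.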
I've done this as seen here: NielsRogge@ec61d72.
I'm also not 100% sure investing time in a model that is only accessible on GPU is the best, as it restricts a lot the number of users that can play with it, and there won't be any regular tests or inference widget.
However, this one is done, so let's finish it (just saying the above for the selection of future models we implement). The main problem for the tests is just the line
MSDA = load_cuda_kernels()
flagged above. It should be inside an `if is_torch_cuda_available()` block, and the else branch should set the same object to None. Then all models should error at init if there is no GPU, and the whole tests of those models should be decorated with the right `require` decorator.
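A minimal sketch of that suggestion; `load_cuda_kernels` is stubbed here so the shape of the fix is visible without the real compiled extension, and `cuda_is_available` stands in for the library's `is_torch_cuda_available()`:

```python
import importlib.util

def cuda_is_available():
    # Stand-in for is_torch_cuda_available(); also guard the torch import so
    # the sketch runs in environments where torch is not installed.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()

def load_cuda_kernels():
    # In modeling_deformable_detr.py this compiles and loads the custom
    # MultiScaleDeformableAttention op; stubbed out for this sketch.
    return object()

# The reviewer's suggestion: only load the kernel when CUDA is present;
# otherwise leave MSDA as None and let models error at init instead.
MSDA = load_cuda_kernels() if cuda_is_available() else None
```

With this guard, importing the modeling file no longer fails on CPU-only machines, which is exactly what lets the generic (non-GPU) tests collect the module without crashing.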
Pinging @Narsil regarding excluding this model from the pipeline tests.
Hi @NielsRogge. The best location to do this is in the pipeline test files. But the tests currently seem to be passing, so is this really necessary?
Force-pushed from 571a25f to 191de2d.
PR is ready for review. By adding the model to the mappings, this happens:
This should resolve the errors in the pipeline tests.
@sgugger that didn't seem to fix the recursion error.
Force-pushed from bc68c55 to 48a26c6.
LGTM, thanks for polishing this PR!
* First draft
* More improvements
* Improve model, add custom CUDA code
* Import torch before
* Add script that imports custom layer
* Add everything in new ops directory
* Import custom layer in modeling file
* Fix ARCHIVE_MAP typo
* Creating the custom kernel on the fly.
* Import custom layer in modeling file
* More improvements
* Fix CUDA loading
* More improvements
* Improve conversion script
* Improve conversion script
* Make it work until encoder_outputs
* Make forward pass work
* More improvements
* Make logits match original implementation
* Make implementation also support single_scale model
* Add support for single_scale and dilation checkpoint
* Add support for with_box_refine model
* Support also two stage model
* Improve tests
* Fix more tests
* Make more tests pass
* Upload all models to the hub
* Clean up some code
* Improve decoder outputs
* Rename intermediate hidden states and reference points
* Improve model outputs
* Move tests to dedicated folder
* Improve model outputs
* Fix retain_grad test
* Improve docs
* Clean up and make test_initialization pass
* Improve variable names
* Add copied from statements
* Improve docs
* Fix style
* Improve docs
* Improve docs, move tests to model folder
* Fix rebase
* Remove DetrForSegmentation from auto mapping
* Apply suggestions from code review
* Improve variable names and docstrings
* Apply some more suggestions from code review
* Apply suggestion from code review
* better docs and variables names
* hint to num_queries and two_stage confusion
* remove asserts and code refactor
* add exception if two_stage is True and with_box_refine is False
* use f-strings
* Improve docs and variable names
* Fix code quality
* Fix rebase
* Add require_torch_gpu decorator
* Add pip install ninja to CI jobs
* Apply suggestion of @sgugger
* Remove DeformableDetrForObjectDetection from auto mapping
* Remove DeformableDetrModel from auto mapping
* Add model to toctree
* Add model back to mappings, skip model in pipeline tests
* Apply @sgugger's suggestion
* Fix imports in the init
* Fix copies
* Add CPU implementation
* Comment out GPU function
* Undo previous change
* Apply more suggestions
* Remove require_torch_gpu annotator
* Fix quality
* Add logger.info
* Fix logger
* Fix variable names
* Fix initialization
* Add missing initialization
* Update checkpoint name
* Add model to doc tests
* Add CPU/GPU equivalence test
* Add Deformable DETR to pipeline tests
* Skip model for object detection pipeline

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Hi @NielsRogge. I am following the finetuning notebook for DETR object detection. You have mentioned that Deformable DETR follows mostly the same API, but I noticed some differences in the model. Also, for the feature extractor, I am confused about which one the documentation says to use. To add further, I was wondering if we could add the augmentation that the original paper follows from the official repo. I managed to add augmentation based on functions available in the official Deformable-DETR repo, but I'm not sure of its correctness.
What does this PR do?
This PR implements Deformable DETR, which improves the original DETR using a new "deformable attention" module.
This model requires a custom CUDA kernel (hence it can only be run on GPU). Other than that, the API is entirely the same as DETR.
Models are on the hub.
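Since the API matches DETR, post-processing works the same way: the model predicts boxes as normalized (center_x, center_y, width, height), which are converted to corner format and scaled to the image size. A minimal, dependency-free sketch of that conversion (the helper name is mine, not the library's):

```python
def center_to_corners(box, img_w, img_h):
    # box is (cx, cy, w, h), each in [0, 1] relative to the image;
    # returns absolute-pixel (x_min, y_min, x_max, y_max), the format
    # DETR-style post-processing hands back to the user.
    cx, cy, w, h = box
    return ((cx - w / 2) * img_w, (cy - h / 2) * img_h,
            (cx + w / 2) * img_w, (cy + h / 2) * img_h)

# e.g. a centered box covering 20% x 40% of a 100x200 image:
# center_to_corners((0.5, 0.5, 0.2, 0.4), 100, 200) -> (40.0, 60.0, 60.0, 140.0)
```

In the library this conversion happens inside the feature extractor's post-processing method, so users normally never call it directly.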