Add TFConvNextModel #15750

Merged
merged 77 commits into from Feb 25, 2022
Changes from 59 commits
d2a0848
feat: initial implementation of convnext in tensorflow.
sayakpaul Feb 8, 2022
66fc8fa
Merge branch 'master' into convnext-tf
sayakpaul Feb 8, 2022
583769c
fix: sample code for the classification model.
sayakpaul Feb 9, 2022
1e0a589
Merge branch 'master' into convnext-tf
sayakpaul Feb 9, 2022
c667d93
chore: added checked for from the classification model.
sayakpaul Feb 9, 2022
7aecfa9
chore: set bias initializer in the classification head.
sayakpaul Feb 9, 2022
222c465
chore: updated license terms.
sayakpaul Feb 9, 2022
835dbdb
chore: removed ununsed imports
sayakpaul Feb 9, 2022
d6f91b6
feat: enabled argument during using drop_path.
sayakpaul Feb 9, 2022
e92b6ce
Merge branch 'master' into convnext-tf
sayakpaul Feb 10, 2022
e1fec88
chore: replaced tf.identity with layers.Activation(linear).
sayakpaul Feb 10, 2022
30e4bcb
chore: edited default checkpoint.
sayakpaul Feb 11, 2022
b0051ac
fix: minor bugs in the initializations.
sayakpaul Feb 11, 2022
aeb14f7
partial-fix: tf model errors for loading pretrained pt weights.
sayakpaul Feb 11, 2022
aec69dc
partial-fix: call method updated
ariG23498 Feb 11, 2022
6c0fae2
partial-fix: cross loading of weights (4x3 variables to be matched)
sayakpaul Feb 12, 2022
ee62db4
chore: removed unneeded comment.
sayakpaul Feb 13, 2022
8c1d6a3
removed playground.py
sayakpaul Feb 13, 2022
490adf8
rebasing
sayakpaul Feb 13, 2022
fa49469
rebasing and removing playground.py.
sayakpaul Feb 13, 2022
18f0b0a
Merge branch 'master' into convnext-tf
sayakpaul Feb 13, 2022
acb6fa0
fix: renaming TFConvNextStage conv and layer norm layers
ariG23498 Feb 14, 2022
077ee25
Merge branch 'convnext-tf' of https://github.com/sayakpaul/transforme…
sayakpaul Feb 14, 2022
8d56711
chore: added initializers and other minor additions.
sayakpaul Feb 14, 2022
11b0683
chore: added initializers and other minor additions.
sayakpaul Feb 14, 2022
fd0ca7f
add: tests for convnext.
sayakpaul Feb 14, 2022
98911a2
fix: integration tester class.
sayakpaul Feb 14, 2022
b30a8cc
fix: issues mentioned in pr feedback (round 1).
sayakpaul Feb 16, 2022
2181d5b
fix: how output_hidden_states arg is propoagated inside the network.
sayakpaul Feb 16, 2022
cc98979
feat: handling of arg for pure cnn models.
sayakpaul Feb 16, 2022
12e4505
chore: added a note on equal contribution in model docs.
sayakpaul Feb 16, 2022
eb49338
rebasing
sayakpaul Feb 13, 2022
5e01b71
rebasing and removing playground.py.
sayakpaul Feb 13, 2022
3bd1c92
Merge branch 'master' of https://github.com/sayakpaul/transformers
sayakpaul Feb 16, 2022
3aefac7
Merge branch 'master' into convnext-tf
sayakpaul Feb 16, 2022
908d0cf
feat: encapsulation for the convnext trunk.
sayakpaul Feb 17, 2022
d386cf8
Fix variable naming; Test-related corrections; Run make fixup
gante Feb 18, 2022
15c916f
chore: added Joao as a contributor to convnext.
sayakpaul Feb 21, 2022
05b8273
rebasing
sayakpaul Feb 13, 2022
d247441
rebasing and removing playground.py.
sayakpaul Feb 13, 2022
bb8e6c2
rebasing
sayakpaul Feb 13, 2022
3b5366d
rebasing and removing playground.py.
sayakpaul Feb 13, 2022
d375214
Merge branch 'master' into convnext-tf
sayakpaul Feb 21, 2022
49b35cd
chore: corrected copyright year and added comment on NHWC.
sayakpaul Feb 21, 2022
d9b5079
chore: fixed the black version and ran formatting.
sayakpaul Feb 21, 2022
4b4737f
chore: ran make style.
sayakpaul Feb 21, 2022
2322a5f
chore: removed from_pt argument from test, ran make style.
sayakpaul Feb 22, 2022
61ae121
rebasing
sayakpaul Feb 13, 2022
b568377
rebasing and removing playground.py.
sayakpaul Feb 13, 2022
1259bf8
rebasing
sayakpaul Feb 13, 2022
96c1ea4
rebasing and removing playground.py.
sayakpaul Feb 13, 2022
0f98fb5
Merge branch 'master' into convnext-tf
sayakpaul Feb 22, 2022
b197216
fix: tests in the convnext subclass, ran make style.
sayakpaul Feb 24, 2022
7dcd98a
rebasing
sayakpaul Feb 13, 2022
95fffed
rebasing and removing playground.py.
sayakpaul Feb 13, 2022
f8129a1
rebasing
sayakpaul Feb 13, 2022
dab6866
rebasing and removing playground.py.
sayakpaul Feb 13, 2022
9d6b8ad
Merge branch 'master' into convnext-tf
sayakpaul Feb 24, 2022
69b5413
chore: moved convnext test to the correct location
sayakpaul Feb 24, 2022
15c6814
fix: locations for the test file of convnext.
sayakpaul Feb 24, 2022
e39c41b
Merge branch 'fix/convnext-tf' into convnext-tf
sayakpaul Feb 24, 2022
98111f8
fix: convnext tests.
sayakpaul Feb 24, 2022
3e06942
chore: applied sgugger's suggestion for dealing w/ output_attentions.
sayakpaul Feb 24, 2022
bc46016
chore: added comments.
sayakpaul Feb 24, 2022
06e19cd
chore: applied updated quality enviornment style.
sayakpaul Feb 24, 2022
229a817
chore: applied formatting with quality enviornment.
sayakpaul Feb 24, 2022
ad5d7e0
chore: revert to the previous tests/test_modeling_common.py.
sayakpaul Feb 24, 2022
4dea175
chore: revert to the original test_modeling_common.py
sayakpaul Feb 24, 2022
0f8069d
chore: revert to previous states for test_modeling_tf_common.py and m…
sayakpaul Feb 25, 2022
f4292b4
fix: tests for convnext.
sayakpaul Feb 25, 2022
8b99c8e
chore: removed output_attentions argument from convnext config.
sayakpaul Feb 25, 2022
7819850
chore: revert to the earlier tf utils.
sayakpaul Feb 25, 2022
ba9484f
fix: output shapes of the hidden states
sayakpaul Feb 25, 2022
553bac5
chore: removed unnecessary comment
sayakpaul Feb 25, 2022
d22e0cb
chore: reverting to the right test_modeling_tf_common.py.
sayakpaul Feb 25, 2022
de00fb2
Styling nits
sgugger Feb 25, 2022
b2309fe
Merge pull request #3 from huggingface/tfconvnext
sayakpaul Feb 25, 2022
2 changes: 1 addition & 1 deletion docs/source/index.mdx
@@ -179,7 +179,7 @@ Flax), PyTorch, and/or TensorFlow.
| Canine | | | | | |
| CLIP | | | | | |
| ConvBERT | | | | | |
| ConvNext | | | | | |
| ConvNext | | | | | |
| CTRL | | | | | |
| DeBERTa | | | | | |
| DeBERTa-v2 | | | | | |
17 changes: 15 additions & 2 deletions docs/source/model_doc/convnext.mdx
@@ -37,7 +37,8 @@ alt="drawing" width="600"/>

<small> ConvNeXT architecture. Taken from the <a href="https://arxiv.org/abs/2201.03545">original paper</a>.</small>

This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code can be found [here](https://github.com/facebookresearch/ConvNeXt).
This model was contributed by [nielsr](https://huggingface.co/nielsr). The TensorFlow version of the model was contributed by [ariG23498](https://github.com/ariG23498),
[gante](https://github.com/gante), and [sayakpaul](https://github.com/sayakpaul) (equal contribution). The original code can be found [here](https://github.com/facebookresearch/ConvNeXt).

## ConvNeXT specific outputs

@@ -63,4 +64,16 @@ This model was contributed by [nielsr](https://huggingface.co/nielsr). The origi
## ConvNextForImageClassification

[[autodoc]] ConvNextForImageClassification
- forward
- forward


## TFConvNextModel

[[autodoc]] TFConvNextModel
- call


## TFConvNextForImageClassification

[[autodoc]] TFConvNextForImageClassification
- call
8 changes: 8 additions & 0 deletions src/transformers/__init__.py
@@ -1743,6 +1743,13 @@
"TFConvBertPreTrainedModel",
]
)
_import_structure["models.convnext"].extend(
[
"TFConvNextForImageClassification",
"TFConvNextModel",
"TFConvNextPreTrainedModel",
]
)
_import_structure["models.ctrl"].extend(
[
"TF_CTRL_PRETRAINED_MODEL_ARCHIVE_LIST",
@@ -3751,6 +3758,7 @@
TFConvBertModel,
TFConvBertPreTrainedModel,
)
from .models.convnext import TFConvNextForImageClassification, TFConvNextModel, TFConvNextPreTrainedModel
from .models.ctrl import (
TF_CTRL_PRETRAINED_MODEL_ARCHIVE_LIST,
TFCTRLForSequenceClassification,
7 changes: 4 additions & 3 deletions src/transformers/modeling_tf_utils.py
@@ -311,9 +311,10 @@ def booleans_processing(config, **kwargs):
     final_booleans = {}

     if tf.executing_eagerly():
-        final_booleans["output_attentions"] = (
-            kwargs["output_attentions"] if kwargs["output_attentions"] is not None else config.output_attentions
-        )
+        final_booleans["output_attentions"] = kwargs.get("output_attentions", None)
+        if not final_booleans["output_attentions"]:
+            final_booleans["output_attentions"] = config.output_attentions
Collaborator

This change is unrelated to this PR, and the test makes the config override the passed output_attentions value when that value is False, which should not be the case. The test should be:

Suggested change
-        final_booleans["output_attentions"] = kwargs.get("output_attentions", None)
-        if not final_booleans["output_attentions"]:
-            final_booleans["output_attentions"] = config.output_attentions
+        final_booleans["output_attentions"] = kwargs.get("output_attentions", None)
+        if final_booleans["output_attentions"] is None:
+            final_booleans["output_attentions"] = config.output_attentions

and this should really be in its own PR if it's fixing a bug.
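
The difference between the two tests can be seen in a minimal sketch. The helper names `resolve_falsy` and `resolve_none` are hypothetical and are not part of transformers; a `SimpleNamespace` stands in for a model config whose default is `True`.

```python
from types import SimpleNamespace

# Stand-in for a model config whose default enables attention outputs.
config = SimpleNamespace(output_attentions=True)


def resolve_falsy(kwargs, config):
    """Test used in this diff: falls back to the config on any falsy value."""
    value = kwargs.get("output_attentions", None)
    if not value:
        value = config.output_attentions
    return value


def resolve_none(kwargs, config):
    """Suggested test: falls back to the config only when the value is None."""
    value = kwargs.get("output_attentions", None)
    if value is None:
        value = config.output_attentions
    return value


explicit_false = {"output_attentions": False}
print(resolve_falsy(explicit_false, config))  # True: the caller's False is overridden
print(resolve_none(explicit_false, config))   # False: the caller's False is respected
print(resolve_none({}, config))               # True: a missing key falls back to the config
```

With `if not ...`, an explicit `False` is indistinguishable from "not provided", so the config default silently wins.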

Member Author

Without this change, I don't think it'd be possible to deal with the output_attentions argument in the TF model.

> and this should really be in its own PR if it's fixing a bug.

What should be done for this PR, then?

Collaborator

Just adding some context (if not clear yet): booleans_processing() assumes that output_attentions will be in kwargs.
This has been true so far, as we always add output_attentions to the model arguments.

ConvNextModel is the first (?) one that doesn't have an output_attentions argument.
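
As a rough illustration of the assumption described above, the sketch below contrasts direct indexing with a guarded lookup. Both function names are hypothetical, the config is a `SimpleNamespace` stand-in, and the guarded version is only one possible way to handle a model without the argument, not necessarily what the library does.

```python
from types import SimpleNamespace

config = SimpleNamespace(output_attentions=False, output_hidden_states=False)


def strict_lookup(config, **kwargs):
    """Mirrors the original indexing style: assumes the key is always present."""
    value = kwargs["output_attentions"]
    return value if value is not None else config.output_attentions


def guarded_lookup(config, **kwargs):
    """Hypothetical alternative: skip the flag when the model has no such argument."""
    final = {}
    if "output_attentions" in kwargs:
        value = kwargs["output_attentions"]
        final["output_attentions"] = value if value is not None else config.output_attentions
    return final


# Attention-based model: the key is present even when unset (None).
print(strict_lookup(config, output_attentions=None))

# Pure convolutional model: the key is absent from the call signature.
try:
    strict_lookup(config, output_hidden_states=True)
except KeyError as err:
    print("KeyError:", err)

print(guarded_lookup(config, output_hidden_states=True))
```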

Member Author

@sgugger @ydshieh just applied the change and added a comment to explain why it's required.


         final_booleans["output_hidden_states"] = (
             kwargs["output_hidden_states"]
             if kwargs["output_hidden_states"] is not None
2 changes: 2 additions & 0 deletions src/transformers/models/auto/modeling_tf_auto.py
@@ -36,6 +36,7 @@
("rembert", "TFRemBertModel"),
("roformer", "TFRoFormerModel"),
("convbert", "TFConvBertModel"),
("convnext", "TFConvNextModel"),
("led", "TFLEDModel"),
("lxmert", "TFLxmertModel"),
("mt5", "TFMT5Model"),
@@ -155,6 +156,7 @@
[
# Model for Image-classification
("vit", "TFViTForImageClassification"),
("convnext", "TFConvNextForImageClassification"),
]
)

11 changes: 10 additions & 1 deletion src/transformers/models/convnext/__init__.py
@@ -18,7 +18,7 @@
from typing import TYPE_CHECKING

# rely on isort to merge the imports
from ...file_utils import _LazyModule, is_torch_available, is_vision_available
from ...file_utils import _LazyModule, is_tf_available, is_torch_available, is_vision_available


_import_structure = {
@@ -36,6 +36,12 @@
"ConvNextPreTrainedModel",
]

if is_tf_available():
_import_structure["modeling_tf_convnext"] = [
"TFConvNextForImageClassification",
"TFConvNextModel",
"TFConvNextPreTrainedModel",
]

if TYPE_CHECKING:
from .configuration_convnext import CONVNEXT_PRETRAINED_CONFIG_ARCHIVE_MAP, ConvNextConfig
@@ -51,6 +57,9 @@
ConvNextPreTrainedModel,
)

if is_tf_available():
from .modeling_tf_convnext import TFConvNextForImageClassification, TFConvNextModel, TFConvNextPreTrainedModel


else:
import sys
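
The conditional registration in this file can be pictured with a small standalone sketch. It assumes a simplified availability check based on `importlib.util.find_spec`; `is_backend_available` and `build_import_structure` are hypothetical names and do not reproduce the actual `is_tf_available` or `_LazyModule` implementations.

```python
import importlib.util


def is_backend_available(name: str) -> bool:
    """Return True if the top-level package `name` can be found on the import path."""
    return importlib.util.find_spec(name) is not None


def build_import_structure(tf_available: bool) -> dict:
    """Map submodule names to the public names they export."""
    structure = {
        "configuration_convnext": ["CONVNEXT_PRETRAINED_CONFIG_ARCHIVE_MAP", "ConvNextConfig"],
    }
    # TensorFlow classes are registered only when the backend is installed,
    # and they are keyed by the TF modeling file, not the PyTorch one.
    if tf_available:
        structure["modeling_tf_convnext"] = [
            "TFConvNextForImageClassification",
            "TFConvNextModel",
            "TFConvNextPreTrainedModel",
        ]
    return structure


structure = build_import_structure(is_backend_available("tensorflow"))
print(sorted(structure))
```

The key under which the TF names are registered has to match the module they are later imported from.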
3 changes: 3 additions & 0 deletions src/transformers/models/convnext/configuration_convnext.py
@@ -85,6 +85,7 @@ def __init__(
is_encoder_decoder=False,
layer_scale_init_value=1e-6,
drop_path_rate=0.0,
image_size=224,
**kwargs
):
super().__init__(**kwargs)
@@ -99,3 +100,5 @@ def __init__(
self.layer_norm_eps = layer_norm_eps
self.layer_scale_init_value = layer_scale_init_value
self.drop_path_rate = drop_path_rate
self.image_size = image_size
self.output_attentions = None
Collaborator

ydshieh Feb 21, 2022

This line causes a PyTorch test to fail:

raise ValueError(f"The following keys were not properly set in the config:\n{errors}")

I understand why you set None here, but test_configuration_common requires it to be assigned the value specified in kwargs. I will leave it to others to work out a solution.

Error message on CircleCI

self = <tests.test_configuration_common.ConfigTester object at 0x7f498c720dd0>

    def check_config_arguments_init(self):
        kwargs = copy.deepcopy(config_common_kwargs)
        config = self.config_class(**kwargs)
        wrong_values = []
        for key, value in config_common_kwargs.items():
            if key == "torch_dtype":
                if not is_torch_available():
                    continue
                else:
                    import torch
    
                    if config.torch_dtype != torch.float16:
                        wrong_values.append(("torch_dtype", config.torch_dtype, torch.float16))
            elif getattr(config, key) != value:
                wrong_values.append((key, getattr(config, key), value))
    
        if len(wrong_values) > 0:
            errors = "\n".join([f"- {v[0]}: got {v[1]} instead of {v[2]}" for v in wrong_values])
>           raise ValueError(f"The following keys were not properly set in the config:\n{errors}")
E           ValueError: The following keys were not properly set in the config:
E           - output_attentions: got None instead of True

tests/test_configuration_common.py:191: ValueError

Member Author

Yeah happy to discuss this with others.

Collaborator

This line should not be present indeed.
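
A minimal sketch of why the assignment trips the common config test is shown here. `BaseConfig`, `ForcedNoneConfig`, `PlainConfig`, and `wrong_values` are hypothetical stand-ins written for illustration; the check is a simplified version of the loop in the traceback above, not the test suite's actual code.

```python
class BaseConfig:
    """Hypothetical base config that stores a common keyword argument."""

    def __init__(self, **kwargs):
        self.output_attentions = kwargs.pop("output_attentions", False)


class ForcedNoneConfig(BaseConfig):
    def __init__(self, image_size=224, **kwargs):
        super().__init__(**kwargs)
        self.image_size = image_size
        self.output_attentions = None  # overwrites whatever the caller passed


class PlainConfig(BaseConfig):
    def __init__(self, image_size=224, **kwargs):
        super().__init__(**kwargs)
        self.image_size = image_size


def wrong_values(config_class, common_kwargs):
    """Collect (key, actual, expected) for every kwarg the config failed to keep."""
    config = config_class(**common_kwargs)
    return [
        (key, getattr(config, key), value)
        for key, value in common_kwargs.items()
        if getattr(config, key) != value
    ]


common_kwargs = {"output_attentions": True}
print(wrong_values(ForcedNoneConfig, common_kwargs))  # [('output_attentions', None, True)]
print(wrong_values(PlainConfig, common_kwargs))       # []
```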