Add Data2Vec #15507

Merged
merged 128 commits on Mar 1, 2022

Changes from all commits (128 commits)
faa4d1d
Add data2vec model cloned from roberta
Feb 3, 2022
9e79329
Add checkpoint conversion script
Feb 3, 2022
47e6bbc
Fix copies
Feb 3, 2022
5e70a95
Update docs
Feb 3, 2022
2eddba9
Add checkpoint conversion script
Feb 3, 2022
991e6d9
Remove fairseq data2vec_text script and fix format
Feb 5, 2022
caeb28d
Add comment on where to get data2vec_text.py
Feb 5, 2022
4c0565e
Remove mock implementation cheat.py and fix style
Feb 5, 2022
3517038
Fix copies
Feb 5, 2022
54d0f37
Remove TF and Flax classes from init
Feb 5, 2022
81f36f7
Add back copy from fairseq data2vec_text.py and fix style
Feb 5, 2022
7c3ec90
Update model name in docs/source/index.mdx to be CamelCase
Feb 5, 2022
2a652c4
Revert model name in table to lower-case to get check_table test to pass
Feb 5, 2022
65219fb
Update src/transformers/models/data2vec/__init__.py
edugp Feb 7, 2022
432fef5
Update src/transformers/models/data2vec/convert_data2vec_original_pyt…
edugp Feb 7, 2022
6aa237d
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
c82c7b2
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
96c0899
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
dd40020
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
5705951
Update docs/source/model_doc/data2vec.mdx
edugp Feb 7, 2022
aa38abf
Update docs/source/model_doc/data2vec.mdx
edugp Feb 7, 2022
d0936e8
Update src/transformers/models/auto/configuration_auto.py
edugp Feb 7, 2022
3a058d7
Update src/transformers/models/data2vec/configuration_data2vec.py
edugp Feb 7, 2022
6d73033
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
a555792
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
5b7dff1
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
492f510
Update tests/test_modeling_data2vec.py
edugp Feb 7, 2022
5e96b70
Update src/transformers/models/data2vec/configuration_data2vec.py
edugp Feb 7, 2022
34d045c
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
d239d63
Update documentation
Feb 8, 2022
3df8a74
Merge branch 'add-data2vec-from-roberta' of https://github.com/edugp/…
Feb 8, 2022
3e6cd53
Copy-paste Data2VecConfig from BertConfig
Feb 9, 2022
91c03bd
Update config checkpoint to point to edugp/data2vec-nlp-base. Fix sty…
Feb 9, 2022
7fa2e2a
Update config special tokens to match RoBERTa
Feb 9, 2022
988bbf0
Split multiple assertions and add individual error messages
Feb 9, 2022
b796850
Rename Data2VecModel to Data2VecForTextModel
Feb 9, 2022
0ad60a6
Add Data2Vec to _toctree.yml
Feb 9, 2022
913de4f
Rename Data2VecEmbeddings to Data2VecForTextEmbeddings
Feb 9, 2022
8e5902f
Add initial Data2VecForAudio model (unfinished). Only matching fairse…
Feb 16, 2022
2830260
finish audio model
patrickvonplaten Feb 18, 2022
23b7de8
finish audio file
patrickvonplaten Feb 18, 2022
9e5ac32
Update names and fix style, quality and repo consistency
Feb 20, 2022
0d2cf13
Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimi…
Feb 22, 2022
68491bc
Merge branch 'master' of https://github.com/huggingface/transformers …
patrickvonplaten Feb 23, 2022
0b6e4ba
add inputs to logits to data2vec'
patrickvonplaten Feb 23, 2022
5ae35a7
Merge branch 'add-data2vec-from-roberta' of https://github.com/edugp/…
patrickvonplaten Feb 23, 2022
d3e6d27
correct autio models
patrickvonplaten Feb 23, 2022
b55f326
correct config auto
patrickvonplaten Feb 23, 2022
4ff05bb
correct tok auto
patrickvonplaten Feb 23, 2022
f216196
Update utils/tests_fetcher.py
patrickvonplaten Feb 23, 2022
c98558e
delete unnecessary files
patrickvonplaten Feb 23, 2022
2d260f2
delete unnecessary files
patrickvonplaten Feb 23, 2022
050a159
Merge branch 'add-data2vec-from-roberta' of https://github.com/edugp/…
patrickvonplaten Feb 23, 2022
fee4f8d
further renaming
patrickvonplaten Feb 23, 2022
d0a7cf9
make all tests pass
patrickvonplaten Feb 23, 2022
2b958d2
finish
patrickvonplaten Feb 23, 2022
89d8f9b
remove useless test file
patrickvonplaten Feb 23, 2022
86cc898
Update tests/test_modeling_common.py
patrickvonplaten Feb 23, 2022
a2de595
Update utils/check_repo.py
edugp Feb 23, 2022
dafc36d
Update src/transformers/models/data2vec/modeling_data2vec_text.py
edugp Feb 23, 2022
985ed72
Merge branch 'huggingface:master' into master
edugp Feb 23, 2022
79994cf
Fix copies
Feb 3, 2022
b368a6e
Update docs
Feb 3, 2022
e2dbdb2
Remove fairseq data2vec_text script and fix format
Feb 5, 2022
e670461
Add comment on where to get data2vec_text.py
Feb 5, 2022
3d21d60
Remove mock implementation cheat.py and fix style
Feb 5, 2022
16c8361
Fix copies
Feb 5, 2022
4dc16f3
Remove TF and Flax classes from init
Feb 5, 2022
3f1efe1
Add back copy from fairseq data2vec_text.py and fix style
Feb 5, 2022
bf8dd78
Update model name in docs/source/index.mdx to be CamelCase
Feb 5, 2022
7917766
Revert model name in table to lower-case to get check_table test to pass
Feb 5, 2022
cdbb4b7
Update documentation
Feb 8, 2022
578cfd8
Update src/transformers/models/data2vec/__init__.py
edugp Feb 7, 2022
cda97e9
Update src/transformers/models/data2vec/convert_data2vec_original_pyt…
edugp Feb 7, 2022
03b41fc
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
244e52e
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
d4c82e2
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
79eaf6c
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
17f5f86
Update src/transformers/models/auto/configuration_auto.py
edugp Feb 7, 2022
309a43d
Update src/transformers/models/data2vec/configuration_data2vec.py
edugp Feb 7, 2022
acd9cd4
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
c306788
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
b440a4b
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
835ca7f
Update tests/test_modeling_data2vec.py
edugp Feb 7, 2022
ab52bbf
Update src/transformers/models/data2vec/configuration_data2vec.py
edugp Feb 7, 2022
ec8d4b3
Update src/transformers/models/data2vec/modeling_data2vec.py
edugp Feb 7, 2022
5c98ce2
Copy-paste Data2VecConfig from BertConfig
Feb 9, 2022
f8880bc
Update config checkpoint to point to edugp/data2vec-nlp-base. Fix sty…
Feb 9, 2022
0a0a8de
Update config special tokens to match RoBERTa
Feb 9, 2022
2b26499
Split multiple assertions and add individual error messages
Feb 9, 2022
c5d3736
Rename Data2VecModel to Data2VecForTextModel
Feb 9, 2022
84c6ad1
Add Data2Vec to _toctree.yml
Feb 9, 2022
d447c90
Rename Data2VecEmbeddings to Data2VecForTextEmbeddings
Feb 9, 2022
46c0c88
Add initial Data2VecForAudio model (unfinished). Only matching fairse…
Feb 16, 2022
960fe56
finish audio model
patrickvonplaten Feb 18, 2022
f88162c
finish audio file
patrickvonplaten Feb 18, 2022
1a24eae
add inputs to logits to data2vec'
patrickvonplaten Feb 23, 2022
71d8b74
Update names and fix style, quality and repo consistency
Feb 20, 2022
bbd3846
Remove Data2VecAudioForPretraining. Add tests for Data2VecAudio, mimi…
Feb 22, 2022
b1365f7
correct autio models
patrickvonplaten Feb 23, 2022
796ab6e
correct config auto
patrickvonplaten Feb 23, 2022
71be483
correct tok auto
patrickvonplaten Feb 23, 2022
65e80dd
delete unnecessary files
patrickvonplaten Feb 23, 2022
553fb11
delete unnecessary files
patrickvonplaten Feb 23, 2022
6d8c952
Update utils/tests_fetcher.py
patrickvonplaten Feb 23, 2022
4f22fcb
further renaming
patrickvonplaten Feb 23, 2022
45fb62f
make all tests pass
patrickvonplaten Feb 23, 2022
8cde36a
finish
patrickvonplaten Feb 23, 2022
c6a49e9
remove useless test file
patrickvonplaten Feb 23, 2022
b926661
Update tests/test_modeling_common.py
patrickvonplaten Feb 23, 2022
432e42d
Update utils/check_repo.py
edugp Feb 23, 2022
a3ce025
Update src/transformers/models/data2vec/modeling_data2vec_text.py
edugp Feb 23, 2022
3dc71fe
Fix conflict
Feb 23, 2022
0780b03
Move data2vec tests to new structure
Feb 24, 2022
de7f649
Fix test imports for text tests
Feb 24, 2022
f095e35
Remove fairseq files
Feb 24, 2022
7fb3234
Change paper link to arxiv
Feb 24, 2022
cdf60e8
Modify Data2Vec documentation to reflect that the encoder is not shar…
Feb 24, 2022
166217f
Update text model checkpoint to be facebook/data2vec-text-base
Feb 24, 2022
98df301
Add 'Copy from' statements and update paper links and docs
Feb 24, 2022
7d1a3e7
Merge branch 'master' of https://github.com/huggingface/transformers …
patrickvonplaten Feb 25, 2022
02d9e5e
fix copy from statements
patrickvonplaten Feb 25, 2022
5b93a64
improve copied from
patrickvonplaten Feb 25, 2022
a149c18
correct more copied from statements
patrickvonplaten Feb 25, 2022
7708caa
finish copied from stuff
patrickvonplaten Feb 25, 2022
b9e1fe3
make style
patrickvonplaten Feb 25, 2022
3389304
add model to README
patrickvonplaten Feb 25, 2022
0716731
add to master
patrickvonplaten Feb 25, 2022
1 change: 1 addition & 0 deletions README.md
@@ -249,6 +249,7 @@ Current number of checkpoints: ![](https://img.shields.io/endpoint?url=https://h
1. **[ConvBERT](https://huggingface.co/docs/transformers/model_doc/convbert)** (from YituTech) released with the paper [ConvBERT: Improving BERT with Span-based Dynamic Convolution](https://arxiv.org/abs/2008.02496) by Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
1. **[CPM](https://huggingface.co/docs/transformers/model_doc/cpm)** (from Tsinghua University) released with the paper [CPM: A Large-scale Generative Chinese Pre-trained Language Model](https://arxiv.org/abs/2012.00413) by Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
1. **[CTRL](https://huggingface.co/docs/transformers/model_doc/ctrl)** (from Salesforce) released with the paper [CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858) by Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher.
1. **[Data2Vec](https://huggingface.co/docs/transformers/master/model_doc/data2vec)** (from Facebook) released with the paper [Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/abs/2202.03555) by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli.
1. **[DeBERTa](https://huggingface.co/docs/transformers/model_doc/deberta)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
1. **[DeBERTa-v2](https://huggingface.co/docs/transformers/model_doc/deberta-v2)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
1. **[DeiT](https://huggingface.co/docs/transformers/model_doc/deit)** (from Facebook) released with the paper [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
1 change: 1 addition & 0 deletions README_ko.md
@@ -230,6 +230,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
1. **[ConvNeXT](https://huggingface.co/docs/transformers/master/model_doc/convnext)** (from Facebook AI) released with the paper [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545) by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie.
1. **[CPM](https://huggingface.co/docs/transformers/model_doc/cpm)** (from Tsinghua University) released with the paper [CPM: A Large-scale Generative Chinese Pre-trained Language Model](https://arxiv.org/abs/2012.00413) by Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
1. **[CTRL](https://huggingface.co/docs/transformers/model_doc/ctrl)** (from Salesforce) released with the paper [CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858) by Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher.
1. **[Data2Vec](https://huggingface.co/docs/transformers/master/model_doc/data2vec)** (from Facebook) released with the paper [Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/abs/2202.03555) by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli.
1. **[DeBERTa](https://huggingface.co/docs/transformers/model_doc/deberta)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
1. **[DeBERTa-v2](https://huggingface.co/docs/transformers/model_doc/deberta-v2)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
1. **[DeiT](https://huggingface.co/docs/transformers/model_doc/deit)** (from Facebook) released with the paper [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
1 change: 1 addition & 0 deletions README_zh-hans.md
@@ -254,6 +254,7 @@ conda install -c huggingface transformers
1. **[ConvNeXT](https://huggingface.co/docs/transformers/master/model_doc/convnext)** (来自 Facebook AI) 伴随论文 [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545) 由 Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie 发布。
1. **[CPM](https://huggingface.co/docs/transformers/model_doc/cpm)** (来自 Tsinghua University) 伴随论文 [CPM: A Large-scale Generative Chinese Pre-trained Language Model](https://arxiv.org/abs/2012.00413) 由 Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun 发布。
1. **[CTRL](https://huggingface.co/docs/transformers/model_doc/ctrl)** (来自 Salesforce) 伴随论文 [CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858) 由 Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher 发布。
1. **[Data2Vec](https://huggingface.co/docs/transformers/master/model_doc/data2vec)** (来自 Facebook) 伴随论文 [Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/abs/2202.03555) 由 Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli 发布。
1. **[DeBERTa](https://huggingface.co/docs/transformers/model_doc/deberta)** (来自 Microsoft) 伴随论文 [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) 由 Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen 发布。
1. **[DeBERTa-v2](https://huggingface.co/docs/transformers/model_doc/deberta-v2)** (来自 Microsoft) 伴随论文 [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) 由 Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen 发布。
1. **[DeiT](https://huggingface.co/docs/transformers/model_doc/deit)** (来自 Facebook) 伴随论文 [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) 由 Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou 发布。
1 change: 1 addition & 0 deletions README_zh-hant.md
@@ -266,6 +266,7 @@ conda install -c huggingface transformers
1. **[ConvNeXT](https://huggingface.co/docs/transformers/master/model_doc/convnext)** (from Facebook AI) released with the paper [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545) by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie.
1. **[CPM](https://huggingface.co/docs/transformers/model_doc/cpm)** (from Tsinghua University) released with the paper [CPM: A Large-scale Generative Chinese Pre-trained Language Model](https://arxiv.org/abs/2012.00413) by Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
1. **[CTRL](https://huggingface.co/docs/transformers/model_doc/ctrl)** (from Salesforce) released with the paper [CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858) by Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher.
1. **[Data2Vec](https://huggingface.co/docs/transformers/master/model_doc/data2vec)** (from Facebook) released with the paper [Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/abs/2202.03555) by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli.
1. **[DeBERTa](https://huggingface.co/docs/transformers/model_doc/deberta)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
1. **[DeBERTa-v2](https://huggingface.co/docs/transformers/model_doc/deberta-v2)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
1. **[DeiT](https://huggingface.co/docs/transformers/model_doc/deit)** (from Facebook) released with the paper [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
2 changes: 2 additions & 0 deletions docs/source/_toctree.yml
@@ -178,6 +178,8 @@
title: CPM
- local: model_doc/ctrl
title: CTRL
- local: model_doc/data2vec
title: Data2Vec
- local: model_doc/deberta
title: DeBERTa
- local: model_doc/deberta-v2
3 changes: 3 additions & 0 deletions docs/source/index.mdx
@@ -75,6 +75,7 @@ conversion utilities for the following models.
1. **[ConvBERT](model_doc/convbert)** (from YituTech) released with the paper [ConvBERT: Improving BERT with Span-based Dynamic Convolution](https://arxiv.org/abs/2008.02496) by Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
1. **[CPM](model_doc/cpm)** (from Tsinghua University) released with the paper [CPM: A Large-scale Generative Chinese Pre-trained Language Model](https://arxiv.org/abs/2012.00413) by Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
1. **[CTRL](model_doc/ctrl)** (from Salesforce) released with the paper [CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858) by Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher.
1. **[Data2Vec](model_doc/data2vec)** (from Facebook) released with the paper [Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/abs/2202.03555) by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli.
1. **[DeBERTa](model_doc/deberta)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
1. **[DeBERTa-v2](model_doc/deberta-v2)** (from Microsoft) released with the paper [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654) by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
1. **[DeiT](model_doc/deit)** (from Facebook) released with the paper [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877) by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
@@ -181,6 +182,8 @@ Flax), PyTorch, and/or TensorFlow.
| ConvBERT | ✅ | ✅ | ✅ | ✅ | ❌ |
| ConvNext | ❌ | ❌ | ✅ | ❌ | ❌ |
| CTRL | ✅ | ❌ | ✅ | ✅ | ❌ |
| Data2VecAudio | ❌ | ❌ | ✅ | ❌ | ❌ |
| Data2VecText | ❌ | ❌ | ✅ | ❌ | ❌ |
| DeBERTa | ✅ | ✅ | ✅ | ✅ | ❌ |
| DeBERTa-v2 | ✅ | ❌ | ✅ | ✅ | ❌ |
| DeiT | ❌ | ❌ | ✅ | ❌ | ❌ |
110 changes: 110 additions & 0 deletions docs/source/model_doc/data2vec.mdx
@@ -0,0 +1,110 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Data2Vec

## Overview

The Data2Vec model was proposed in [data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/pdf/2202.03555) by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu and Michael Auli.
Data2Vec proposes a unified framework for self-supervised learning across different data modalities - text, audio and images.
Importantly, predicted targets for pre-training are contextualized latent representations of the inputs, rather than modality-specific, context-independent targets.

The abstract from the paper is the following:

*While the general idea of self-supervised learning is identical across modalities, the actual algorithms and
objectives differ widely because they were developed with a single modality in mind. To get us closer to general
self-supervised learning, we present data2vec, a framework that uses the same learning method for either speech,
NLP or computer vision. The core idea is to predict latent representations of the full input data based on a
masked view of the input in a self-distillation setup using a standard Transformer architecture.
Instead of predicting modality-specific targets such as words, visual tokens or units of human speech which
are local in nature, data2vec predicts contextualized latent representations that contain information from
the entire input. Experiments on the major benchmarks of speech recognition, image classification, and
natural language understanding demonstrate a new state of the art or competitive performance to predominant approaches.
Models and code are available at www.github.com/pytorch/fairseq/tree/master/examples/data2vec.*

Tips:

- Data2VecAudio and Data2VecText have both been trained with the same self-supervised learning method.
- For Data2VecAudio, preprocessing is identical to [`Wav2Vec2Model`], including feature extraction.
- For Data2VecText, preprocessing is identical to [`RobertaModel`], including tokenization (see the usage sketches below).
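
A minimal masked-language-modeling sketch for the text model is shown below. It assumes the `facebook/data2vec-text-base` checkpoint referenced in this PR and a RoBERTa-compatible tokenizer resolved through `AutoTokenizer`; treat it as an illustration rather than a definitive example.

```python
import torch
from transformers import AutoTokenizer, Data2VecTextForMaskedLM

# Checkpoint name taken from this PR's commits; the tokenizer is assumed to be
# RoBERTa-compatible, as described in the tips above.
tokenizer = AutoTokenizer.from_pretrained("facebook/data2vec-text-base")
model = Data2VecTextForMaskedLM.from_pretrained("facebook/data2vec-text-base")

inputs = tokenizer("The capital of France is <mask>.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the highest-scoring token at the masked position.
mask_index = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```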

This model was contributed by [edugp](https://huggingface.co/edugp).
The original code can be found [here](https://github.com/pytorch/fairseq/tree/main/examples/data2vec).
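
A similar sketch for the audio model is given below, assuming a hypothetical CTC-fine-tuned checkpoint name (`facebook/data2vec-audio-base-960h`) and Wav2Vec2-style feature extraction via [`Wav2Vec2Processor`]; substitute a real checkpoint as needed.

```python
import numpy as np
import torch
from transformers import Wav2Vec2Processor, Data2VecAudioForCTC

# Hypothetical checkpoint name for a CTC-fine-tuned Data2VecAudio model.
# Wav2Vec2Processor is used on the assumption that Data2VecAudio shares
# Wav2Vec2-style feature extraction, as noted in the tips above.
processor = Wav2Vec2Processor.from_pretrained("facebook/data2vec-audio-base-960h")
model = Data2VecAudioForCTC.from_pretrained("facebook/data2vec-audio-base-960h")

# One second of silence at 16 kHz stands in for real speech.
speech = np.zeros(16000, dtype=np.float32)
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids))
```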


## Data2VecTextConfig

[[autodoc]] Data2VecTextConfig

## Data2VecAudioConfig

[[autodoc]] Data2VecAudioConfig

## Data2VecAudioModel

[[autodoc]] Data2VecAudioModel
- forward


## Data2VecAudioForAudioFrameClassification

[[autodoc]] Data2VecAudioForAudioFrameClassification
- forward

## Data2VecAudioForCTC

[[autodoc]] Data2VecAudioForCTC
- forward

## Data2VecAudioForSequenceClassification

[[autodoc]] Data2VecAudioForSequenceClassification
- forward

## Data2VecAudioForXVector

[[autodoc]] Data2VecAudioForXVector
- forward

## Data2VecTextModel

[[autodoc]] Data2VecTextModel
- forward

## Data2VecTextForCausalLM

[[autodoc]] Data2VecTextForCausalLM
- forward

## Data2VecTextForMaskedLM

[[autodoc]] Data2VecTextForMaskedLM
- forward

## Data2VecTextForSequenceClassification

[[autodoc]] Data2VecTextForSequenceClassification
- forward

## Data2VecTextForMultipleChoice

[[autodoc]] Data2VecTextForMultipleChoice
- forward

## Data2VecTextForTokenClassification

[[autodoc]] Data2VecTextForTokenClassification
- forward

## Data2VecTextForQuestionAnswering

[[autodoc]] Data2VecTextForQuestionAnswering
- forward
1 change: 1 addition & 0 deletions docs/source/serialization.mdx
@@ -49,6 +49,7 @@ Ready-made configurations include the following architectures:
- BART
- BERT
- CamemBERT
- Data2VecText
- DistilBERT
- ELECTRA
- GPT Neo