
Add Wav2Vec2 & Hubert ForSequenceClassification #13153

Merged
merged 11 commits into from Aug 27, 2021

Conversation

@anton-l (Member) commented Aug 17, 2021

What does this PR do?

This adds a Hubert extension for sequence classification.
Ultimately, this classification head should be compatible with the s3prl UtteranceLevel implementation, in order to support classification tasks from SUPERB (such as Keyword Spotting) and to transfer their pretrained models.
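As a rough illustration of the utterance-level head described above (a sketch in the spirit of s3prl's UtteranceLevel module, not the PR's actual code; all names and dimensions are made up): frame-level encoder states are pooled over time into a single utterance vector, which a linear layer maps to class logits.

```python
import numpy as np

# Illustrative sketch only: mean-pooled utterance-level classification head.
rng = np.random.default_rng(0)
hidden_size, num_labels, num_frames = 8, 4, 50

# Fake frame-level hidden states from a speech encoder: (frames, hidden)
hidden_states = rng.standard_normal((num_frames, hidden_size))

# Mean-pool over the time axis to get one utterance-level vector
pooled = hidden_states.mean(axis=0)      # (hidden,)

# Linear classification head: logits = pooled @ W + b
W = rng.standard_normal((hidden_size, num_labels))
b = np.zeros(num_labels)
logits = pooled @ W + b                  # (num_labels,)

predicted_class = int(np.argmax(logits))
```

For padded batches one would mean-pool only over non-padding frames (using the attention mask), but the single-utterance case above shows the core idea.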

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@patrickvonplaten @patil-suraj

…add-speech-classification

# Conflicts:
#	src/transformers/models/hubert/configuration_hubert.py
#	src/transformers/models/hubert/convert_hubert_original_s3prl_checkpoint_to_pytorch.py
#	src/transformers/models/hubert/modeling_hubert.py
#	tests/test_modeling_hubert.py
#	utils/check_repo.py
@anton-l anton-l changed the base branch from master to hubert-test August 25, 2021 10:59
@anton-l anton-l changed the base branch from hubert-test to master August 25, 2021 11:00
@@ -122,6 +122,8 @@
"TFRagTokenForGeneration",
"Wav2Vec2ForCTC",
"HubertForCTC",
"Wav2Vec2ForSequenceClassification",
Contributor
Yes! We have to discuss a bit with @Narsil how to best add those models to pipelines

Contributor

# Update this list for models that are not in any of the auto MODEL_XXX_MAPPING. Being in this list is an exception and
# should **not** be the rule.

Seems like the exception has grown quite a bit :)

Contributor

@patrickvonplaten left a comment

This PR looks to be in very good shape already!
Before merging it would be great if we could:

  • add a # Copied from Wav2Vec2 ... comment to the HuBERT code if it's 1-to-1 the same
  • add one test per task for Wav2Vec2 as well
  • add at least one model for each task to either https://huggingface.co/superb or facebook (let's check with others here)
  • run eval of the models on the datasets to check which models should be normalized and which shouldn't and adapt configs accordingly
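The last bullet amounts to a small sweep: run eval once per normalization setting and keep the better one in the config. A hypothetical sketch (the predictions below are faked stand-ins; a real run would use the actual models and SUPERB datasets):

```python
# Hedged sketch of the requested eval sweep; not the repository's code.
def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    assert len(predictions) == len(labels)
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)

# Toy stand-ins for model outputs under each normalization setting
labels = [0, 1, 1, 0, 2]
preds_normalized = [0, 1, 1, 0, 1]   # e.g. outputs with normalize=True
preds_raw = [0, 1, 1, 0, 2]          # e.g. outputs with normalize=False

# Pick the setting with the higher accuracy and record it for the config
best = max([(accuracy(preds_normalized, labels), True),
            (accuracy(preds_raw, labels), False)])
do_normalize = best[1]
```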

@anton-l (Member, Author) commented Aug 27, 2021

Accuracy evaluation on SUPERB tasks:

  • KS (Keyword Spotting) has uniform-length samples, so no padding
  • ER (Emotion Recognition) has non-uniform padded batches
  • SID (Speaker Identification) is evaluated with batch_size=1, as in s3prl
| Task | Model         | normalize=True | normalize=False | Paper  |
|------|---------------|----------------|-----------------|--------|
| KS   | Wav2Vec2-base | 0.9627         | 0.9643          | 0.9623 |
| KS   | Hubert-base   | 0.9669         | 0.9672          | 0.9630 |
| ER   | Wav2Vec2-base | 0.5281         | 0.6258          | 0.6343 |
| ER   | Hubert-base   | 0.5502         | 0.6359          | 0.6492 |
| SID  | Wav2Vec2-base | 0.7360         | 0.7518          | 0.7518 |
| SID  | Hubert-base   | 0.8071         | 0.8071          | 0.8142 |

So far normalize=False is always better, as expected (s3prl never used normalization during eval).
There's also some slight variation from the official results, but it's of the same magnitude as the difference between s3prl and the paper.
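For context, the normalize toggle above corresponds roughly to per-utterance zero-mean/unit-variance scaling of the raw waveform before it enters the model. A minimal sketch of that behavior (illustrative only, not the library's exact code; the eps value is an assumption):

```python
import numpy as np

def normalize_waveform(waveform, do_normalize=True, eps=1e-7):
    """Scale a raw waveform to zero mean and unit variance (sketch)."""
    waveform = np.asarray(waveform, dtype=np.float64)
    if not do_normalize:
        return waveform                       # pass through unchanged
    return (waveform - waveform.mean()) / np.sqrt(waveform.var() + eps)

raw = np.array([0.5, -0.25, 0.75, 0.0])
norm = normalize_waveform(raw, do_normalize=True)
```

Whether this scaling helps depends on how the upstream model was pretrained, which is exactly why the eval above sweeps both settings per model.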

@anton-l anton-l changed the title [WIP] Add HubertForSequenceClassification [WIP] Add Wav2Vec2 & Hubert ForSequenceClassification Aug 27, 2021
@anton-l (Member, Author) commented Aug 27, 2021

  • Passed integration test for all 4 tasks on both models
  • Added # Copied from comments where possible (the script just inserts a full copy of W2V2.forward() before End copy, so I didn't use it there)
  • Added dummy examples to forward() docs
  • Moved the models to https://huggingface.co/superb

@patrickvonplaten everything should be ready to merge now :)

@anton-l anton-l changed the title [WIP] Add Wav2Vec2 & Hubert ForSequenceClassification Add Wav2Vec2 & Hubert ForSequenceClassification Aug 27, 2021
@patrickvonplaten (Contributor) commented:
Awesome job @anton-l ! Feel free to merge the PR whenever you want

@anton-l anton-l merged commit b6f332e into huggingface:master Aug 27, 2021
@anton-l anton-l deleted the add-speech-classification branch September 8, 2021 21:06
3 participants