Add Wav2Vec2 & Hubert ForSequenceClassification #13153
…add-speech-classification # Conflicts: # src/transformers/models/hubert/configuration_hubert.py # src/transformers/models/hubert/convert_hubert_original_s3prl_checkpoint_to_pytorch.py # src/transformers/models/hubert/modeling_hubert.py # tests/test_modeling_hubert.py # utils/check_repo.py
@@ -122,6 +122,8 @@
"TFRagTokenForGeneration",
"Wav2Vec2ForCTC",
"HubertForCTC",
"Wav2Vec2ForSequenceClassification",
Yes! We have to discuss a bit with @Narsil how to best add those models to pipelines
# Update this list for models that are not in any of the auto MODEL_XXX_MAPPING. Being in this list is an exception and
# should **not** be the rule.
Seems like the exception has grown quite a bit :)
This PR looks to be in very good shape already!
Before merging it would be great if we could:
- add a "#Copied from Wav2Vec2 ..." comment to the HuBERT code if it's 1-to-1 the same
- add one test per task for Wav2Vec2 as well
- add at least one model for each task to either https://huggingface.co/superb or facebook (let's check with others here)
- run eval of the models on the datasets to check which models should be normalized and which shouldn't and adapt configs accordingly
Accuracy evaluation on SUPERB tasks:
So far |
@patrickvonplaten everything should be ready to merge now :)
Awesome job @anton-l! Feel free to merge the PR whenever you want
What does this PR do?
This adds a Hubert extension for sequence classification.
Ultimately this classification head should be compatible with the s3prl UtteranceLevel implementation to support classification tasks from SUPERB, such as Keyword Spotting, and to transfer their pretrained models.
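For readers unfamiliar with utterance-level classification, the idea behind such a head can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it assumes the head projects the encoder's frame-level hidden states, mean-pools them over time while ignoring padded frames, and applies a linear classifier. The class name `UtteranceLevelHead`, its parameters, and all sizes below are hypothetical.

```python
import torch
from torch import nn


class UtteranceLevelHead(nn.Module):
    """Hypothetical utterance-level head: project, mean-pool over time, classify."""

    def __init__(self, hidden_size: int, proj_size: int, num_labels: int):
        super().__init__()
        self.projector = nn.Linear(hidden_size, proj_size)
        self.classifier = nn.Linear(proj_size, num_labels)

    def forward(self, hidden_states, frame_mask=None):
        # hidden_states: (batch, frames, hidden_size), as produced by a speech encoder
        # frame_mask: (batch, frames), True for real frames and False for padding
        projected = self.projector(hidden_states)
        if frame_mask is None:
            pooled = projected.mean(dim=1)
        else:
            mask = frame_mask.unsqueeze(-1).to(projected.dtype)
            # Average only over real frames so padding does not dilute the result
            pooled = (projected * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.classifier(pooled)


torch.manual_seed(0)
# Sizes are arbitrary illustration values, not taken from any released checkpoint
head = UtteranceLevelHead(hidden_size=768, proj_size=256, num_labels=12)
features = torch.randn(2, 50, 768)   # stand-in for encoder output
mask = torch.ones(2, 50, dtype=torch.bool)
mask[1, 30:] = False                 # second utterance has 20 padded frames
logits = head(features, mask)        # one score vector per utterance
```

The masked average matters when utterances in a batch have different lengths: without it, the padded frames of shorter utterances would be averaged into the pooled representation.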
Who can review?
@patrickvonplaten @patil-suraj