[Wav2Vec2] Fix normalization for non-padded tensors #13512
Conversation
```
@@ -79,13 +79,20 @@ def __init__(
        self.do_normalize = do_normalize

    @staticmethod
    def zero_mean_unit_var_norm(input_values: List[np.ndarray], input_lengths: List[int]) -> List[np.ndarray]:
    def zero_mean_unit_var_norm(
```
The responsibility of retrieving the correct length from the attention mask should be in this method, since `input_values` and `attention_mask` are the well-known inputs to functions in `transformers`.
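A minimal sketch of what such a method could look like once it takes the attention mask directly (this is an illustration of the idea in the comment, not necessarily the exact merged implementation; the `1e-7` epsilon is an assumption):

```python
from typing import List, Optional

import numpy as np


def zero_mean_unit_var_norm(
    input_values: List[np.ndarray], attention_mask: Optional[np.ndarray] = None
) -> List[np.ndarray]:
    """Normalize each array to zero mean and unit variance, computing the
    statistics only over the non-padded portion when a mask is given."""
    if attention_mask is None:
        return [(x - x.mean()) / np.sqrt(x.var() + 1e-7) for x in input_values]

    normed = []
    for vector, length in zip(input_values, attention_mask.sum(-1)):
        valid = vector[:length]  # the non-padded samples
        normed_vector = (vector - valid.mean()) / np.sqrt(valid.var() + 1e-7)
        normed_vector[length:] = 0.0  # zero out the padded tail
        normed.append(normed_vector)
    return normed
```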
```
@@ -196,19 +195,33 @@ def __call__(
            return_attention_mask=return_attention_mask,
        )

        if "attention_mask" in padded_inputs:
```
This part is removed/cleaned-up
```
@@ -172,14 +179,6 @@ def __call__(
            and (isinstance(raw_speech[0], np.ndarray) or isinstance(raw_speech[0], (tuple, list)))
        )

        # make sure input is in list format
```
Currently all the padding happens in pure Python, not in NumPy, so let's move the numpification further down.
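A tiny sketch of the "numpify later" idea from the comment above (the variable names here are illustrative, not taken from the actual code):

```python
import numpy as np

# Pad in pure Python first...
batch = [[0.1, 0.2, 0.3], [0.4, 0.5]]  # illustrative ragged input
max_len = max(len(seq) for seq in batch)
padded = [seq + [0.0] * (max_len - len(seq)) for seq in batch]

# ...and only convert to NumPy at the very end.
input_values = np.asarray(padded, dtype=np.float32)
print(input_values.shape)  # (2, 3)
```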
```
@@ -134,7 +134,22 @@ def _check_zero_mean_unit_variance(input_vector):
        _check_zero_mean_unit_variance(input_values[1, :1000])
        _check_zero_mean_unit_variance(input_values[2])

    def test_zero_mean_unit_variance_normalization_trunc(self):
    def test_zero_mean_unit_variance_normalization(self):
```
Add test to make sure normalization always works as expected
Great catch!
This looks good to me.
Looks good to me, but I'll delegate to @patil-suraj's and @anton-l's Wav2Vec2 knowledge.
Let me know once this is merged so that I can release a patch.
LGTM other than the small issues already pointed out, thanks for fixing it!
All slow tests now pass for Wav2Vec2 and Hubert, nice!
LGTM! Thanks for adding all those tests :)
* finalize
* Apply suggestions from code review
* finish cleaner implementation
* more tests
* small fix
* finish
* up
What does this PR do?
This PR fixes a problem with normalization when the input is a list of sequences of different lengths that has not yet been converted to NumPy arrays - see #13504.
Just noticed that this bug is actually pretty severe, as it affects all fine-tuning of the large Wav2Vec2 models :-/.
It was introduced by me in this PR: https://github.com/huggingface/transformers/pull/12804/files - I should have written more and better tests for it.
=> This means that from transformers 4.9.0 until this PR is merged, the normalization for all large Wav2Vec2 models was way off when fine-tuning the model.
@LysandreJik - do you think it might be possible to do a patched release for this?
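To illustrate the class of bug being fixed (a hypothetical sketch, not the actual `transformers` code path): normalizing over a zero-padded batch instead of per-sequence skews the statistics whenever the sequence lengths differ.

```python
import numpy as np

rng = np.random.default_rng(0)
short = rng.normal(5.0, 2.0, 400)
long = rng.normal(5.0, 2.0, 1000)

# Buggy: pad first, then normalize with batch-level statistics,
# letting the padding zeros pollute the mean and variance.
padded = np.zeros((2, 1000))
padded[0, :400] = short
padded[1, :] = long
buggy = (padded - padded.mean()) / padded.std()

# Correct: normalize each sequence over its true length only.
fixed = [(seq - seq.mean()) / seq.std() for seq in (short, long)]

print(abs(fixed[0].mean()))        # ~0, as intended
print(abs(buggy[0, :400].mean()))  # clearly nonzero: padding skewed the stats
```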