decode mp3 with librosa if torchaudio is > 0.12 as a temporary workaround #4923

polinaeterna · 2022-08-31T18:57:59Z

torchaudio>0.12 fails to decode mp3 files if ffmpeg<4. currently we ask users to downgrade torchaudio, but sometimes that's not possible because the torchaudio version is bound to the torch version. as a temporary workaround we can decode mp3 with librosa (though it is 60 times slower, at least it works)

another option would be to ask users to install the required version of ffmpeg, but that is non-trivial on colab: it's not in the apt packages on ubuntu 18 and conda is not preinstalled (with conda it would be easy to install)

decode with torchaudio anyway if the version of ffmpeg is correct? it's 60 times faster
tests
DO NOT FORGET to get back all the tests

see #4776 and #3663 (comment) (there is a Colab notebook to reproduce the error)
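
A minimal sketch of the rule described above (librosa when torchaudio is 0.12 or newer and ffmpeg is older than 4, torchaudio otherwise). The names `parse_version` and `choose_mp3_backend` are hypothetical and not part of `datasets`, and treating a missing ffmpeg the same as an old one is an assumption:

```python
import re


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def choose_mp3_backend(torchaudio_version, ffmpeg_version):
    """Pick a back-end name for mp3 decoding (hypothetical helper).

    `ffmpeg_version` may be None when ffmpeg is not installed (assumption:
    handled like an ffmpeg that is too old).
    """
    needs_new_ffmpeg = parse_version(torchaudio_version) >= (0, 12)
    has_new_ffmpeg = ffmpeg_version is not None and parse_version(ffmpeg_version) >= (4,)
    if needs_new_ffmpeg and not has_new_ffmpeg:
        return "librosa"
    return "torchaudio"
```

For example, `choose_mp3_backend("0.12.1", "3.4.11")` selects librosa, while the same torchaudio with ffmpeg 4 or newer keeps torchaudio.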

HuggingFaceDocBuilderDev · 2022-08-31T19:06:03Z

The documentation is not available anymore as the PR was closed or merged.

lhoestq · 2022-09-09T16:38:47Z

Thanks ! Should we still support torchaudio>0.12 if it works ? And if it doesn't we can explain that downgrading is the right solution, or alternatively use librosa

polinaeterna · 2022-09-12T11:18:12Z

@lhoestq

Should we still support torchaudio>0.12 if it works ? And if it doesn't we can explain that downgrading is the right solution, or alternatively use librosa

I'm not sure here, because on the one hand, if torchaudio works, it is 60 times faster than librosa.
But on the other hand, we will get inconsistent behavior (= different decoding results) for users of torchaudio>=0.12.
I'd rather use librosa only, to avoid the inconsistency. wdyt?

lhoestq · 2022-09-12T15:57:26Z

It seems a bit too constraining to prevent users who have a working torchaudio 0.12 setup from using it.

If the issue is about avoiding silent errors if the decoding changes, maybe we can log which back-end is used ? It can even be a warning with performance suggestions ("you're using librosa but torchaudio 0.xx is recommended").

Note that users can still have a requirements.txt or whatever in their projects if they really want full reproducibility (and it's the bare minimum imo)

There are multiple possible back-ends, so it may not be reasonable to allow only one, especially since each back-end has installation constraints and there's no "best" back-end.
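
A rough sketch of the "use the fast back-end if it works, otherwise fall back and tell the user" idea. The function `decode_with_fallback` and its callable parameters are hypothetical stand-ins, not the code in `src/datasets/features/audio.py`, and assuming the primary decoder signals failure with `RuntimeError` is also an assumption:

```python
import warnings


def decode_with_fallback(path, primary, fallback):
    """Decode `path` with `primary`; if it raises RuntimeError, warn and use `fallback`.

    Both decoders are plain callables taking a path (hypothetical stand-ins
    for a torchaudio-based and a librosa-based decoder).
    Returns a (decoded, backend_name) pair so callers can see which one ran.
    """
    try:
        return primary(path), "primary"
    except RuntimeError as err:
        # Make the slower path visible instead of switching back-ends silently.
        warnings.warn(f"Primary decoder failed ({err}); using the slower fallback decoder.")
        return fallback(path), "fallback"
```

Returning the back-end name alongside the data is one way to make the choice checkable in tests, since both decoders may return arrays of the same shape.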

lhoestq

Cool thank you !

lhoestq · 2022-09-20T09:20:22Z

src/datasets/features/audio.py

+                        "`pip install librosa`. Note that decoding will be extremely slow in that case."
+                    ) from err
+                # try to decode with librosa for torchaudio>=0.12.0 as a workaround
+                logger.warning("Decoding mp3 with `librosa` instead of `torchaudio`, decoding is slow.")


Maybe we should warn this only once ? You can use warnings.warn to do that

I can't figure out how to do this. I changed logger.warning to warnings.warn here and tried setting warnings.filterwarnings("once") / warnings.simplefilter("once") but it didn't work. What am I missing?

ah lol I used the warnings from logger to check whether decoding was done with librosa (as the arrays have the same shapes), and now warnings from warnings.warn are not captured by caplog in pytest, so these tests fail. maybe leave it as is?

I think it shows the warning only once by default, so you don't need to use an extra filter.

And I think with pytest.warns(UserWarning): should work in the test

thanks! fixed it here ebf77d7

I don't know why, but in the librosa case the warnings are shown at each decoding anyway; I checked it on Colab, see pic. Might it be because librosa.load itself emits warnings on decoding: UserWarning: PySoundFile failed. Trying audioread instead.

Anyway, maybe not that important for now? Users can use warnings.filterwarnings("ignore")

Users can still mute those warnings, I don't think we can do anything about it.
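
For reference, a small stdlib-only sketch of how the filter action changes how often a repeated `warnings.warn` call is shown. `decode_stub` and `count_warnings` are hypothetical helpers that only emit the warning text from the diff; the sketch does not reproduce the Colab behavior described above:

```python
import warnings


def decode_stub():
    # Hypothetical stand-in for the librosa code path; it only emits the warning.
    warnings.warn("Decoding mp3 with `librosa` instead of `torchaudio`, decoding is slow.")


def count_warnings(action, calls=3):
    """Call decode_stub() `calls` times under the given filter action and count shown warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action)
        for _ in range(calls):
            decode_stub()
    return len(caught)


shown_default = count_warnings("default")  # shown once per warn location
shown_always = count_warnings("always")    # shown on every call
shown_ignore = count_warnings("ignore")    # muted, as a user could choose to do
```

With the "default" action, a warning raised repeatedly from the same line is reported once, while "always" reports every call and "ignore" mutes it.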

lhoestq · 2022-09-20T13:05:38Z

Woohoo all green ! Feel free to merge if it's all good for you :)

decode mp3 with librosa if torchaudio is > 0.12 (ideally version of f…

37d9eb2

…fmpeg should be checked too)

polinaeterna added 4 commits September 1, 2022 12:45

add flake8 ignore for unused imports

48991d4

improve error mesasage

4114bd5

Merge branch 'huggingface:main' into workaround-torchaudio-0.12

e924ca1

Merge branch 'huggingface:main' into workaround-torchaudio-0.12

2e12525

polinaeterna added 21 commits September 13, 2022 14:17

Merge branch 'huggingface:main' into workaround-torchaudio-0.12

f7dbaa8

decode mp3 with torchaudio>=0.12 if it works (instead of librosa)

bd7a1ef

fix last commit

f7afaff

fix warnings

4a9da10

use datasets logging instead of standard

7fca41f

fix incorrect marks for mp3 tests (require torchaudio, not sndfile)

1305be2

add tests for latest torchaudio + separate stage in CI for it (first …

05d400a

…try)

get back unintantionally removed require_sox for mp3 tests

6e82a88

install ffmpeg in CI env to test torchaudio

883b6f5

test CI again...

d5f06ed

fix pip uninstall - add missing -y param

4fbaeb6

install ffmpeg only on ubuntu

2ee379f

try to compile old version of ffmpeg to test librosa mp3 loading

400d8f1

fix ci.yml

942d396

add some option to configure of old ffmpeg (i have no idea what it me…

b0f8bb7

…ans)

use mock to emulate torchaudio fail, add tests for librosa (not all o…

6d4d9a2

…f them)

add missing | in ci run

3f3cb53

Merge branch 'huggingface:main' into workaround-torchaudio-0.12

0748cc4

try to skip test if ffmpeg not installed

1cae272

remove ffmpeg version checking on windows ci

4665dcf

test torchaudio_latest only on ubuntu

7960700

polinaeterna added 6 commits September 19, 2022 16:08

refactor test for latest torchaudio

5f0efed

check if it works and fails in a single test

try/except decoding with librosa for file-like objects

bfe1be7

more tests for latest torchaudio, should be comlpete set now

9dd632d

remove unused decorator for ffmpeg checking

852176c

refactor ci workflow (first install, then test)

557a9cb

get back full library testing

3f63882

polinaeterna marked this pull request as ready for review September 19, 2022 18:19

polinaeterna requested a review from lhoestq September 19, 2022 18:19

lhoestq approved these changes Sep 20, 2022

polinaeterna and others added 6 commits September 20, 2022 11:45

Update tests/utils.py

a0672cc

Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>

replace logging with warnings

420f63d

Merge branch 'workaround-torchaudio-0.12' of github.com:polinaeterna/…

b4a8793

…datasets into workaround-torchaudio-0.12

Merge branch 'huggingface:main' into workaround-torchaudio-0.12

c016395

fix tests: catch warnings with a pytest context manager

ebf77d7

Merge branch 'workaround-torchaudio-0.12' of github.com:polinaeterna/…

ae67712

…datasets into workaround-torchaudio-0.12

polinaeterna changed the title ~~WIP: decode mp3 with librosa if torchaudio is > 0.12 as a temporary workaround~~ decode mp3 with librosa if torchaudio is > 0.12 as a temporary workaround Sep 20, 2022

polinaeterna merged commit 142404f into huggingface:main Sep 20, 2022

albertvillanova mentioned this pull request Sep 21, 2022

[Audio] Path of Common Voice cannot be used for audio loading anymore #3663

Closed

polinaeterna deleted the workaround-torchaudio-0.12 branch November 2, 2022 11:54
