
Improve MelSpectrogram librosa compatibility test #1267

Merged
merged 10 commits into pytorch:master on Feb 15, 2021

Conversation

@worc3131 (Contributor) commented on Feb 13, 2021

Resolves #1262

I've gone off piste from the steps laid out in the issue. Rather than splitting things out as separate tests and using parameterized, I decided to maintain more control by using unittest.subTest. I wanted to be able to generate lots of different variants, restrict them down to the relevant parameters for each component, and then remove any resulting duplicates (so as to cut down the test time), while maintaining the test skip on TravisCI. Trying to do all this in a decorator for parameterized was messy, so hopefully this approach is better. I'm concerned that it might just be over-engineered for what you need, though, and more difficult to understand than the dumb approach. Let me know what you think; I can drop back to the recommended way if you would like.
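For context, the subTest approach described above looks roughly like the sketch below; the parameter values, the projection onto n_fft/hop_length, and the placeholder assertion are illustrative, not the actual code in this PR.

import itertools
import unittest


class TestLibrosaCompatibility(unittest.TestCase):
    def test_melspectrogram(self):
        # Generate lots of variants, restrict them to the parameters this
        # component cares about, then deduplicate to cut down the test time.
        all_variants = itertools.product([400, 600], [100, 200], [None, 'slaney'])
        configs = sorted({(n_fft, hop_length) for n_fft, hop_length, _ in all_variants})
        for n_fft, hop_length in configs:
            # Each configuration is reported separately, even if one fails.
            with self.subTest(n_fft=n_fft, hop_length=hop_length):
                self.assertGreater(n_fft, hop_length)  # stand-in for the real check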

The change was tested by running:

(cd test && pytest torchaudio_unittest/librosa_compatibility_test.py)

@mthrok (Collaborator) left a comment


To improve the test, we would like to run:

  1. different kinds of unit tests independently, and
  2. different parameters for the same unit test independently.

The original implementation met point 2 by splitting the parameters into different test methods, but did not meet point 1 because all the tests were implemented in the same method. (Note that one executed method equals one unit test.)

Using a for loop inside a test to cover different parameters is what we want to avoid, because when one configuration fails, the remaining configurations are not executed. subTest might make that work, but it is still not good enough: when querying the available tests, the configurations defined under one method are counted together and are not discoverable until they are executed. Parameterizing with a decorator solves this nicely, and it also keeps the test logic and the parameters clearly separated. So please do not use a for loop and subTest; instead use parameterized.expand like all the other tests do.

I like your suggestion of splitting the individual tests; that achieves point 1. Then all you need to do is decorate each split test with parameterized.expand, which will look like:

@parameterized.expand([
    ...,
])
def test_spectrogram(self, ...):
    ...

@parameterized.expand([
    ...,
])
def test_melspectrogram(self, ...):
    ...

@parameterized.expand([
    ...,
])
def test_s2db(self, ...):
    ...

@parameterized.expand([
    ...,
])
def test_mfcc(self, ...):
    ...

With this approach, the test code becomes far more readable than using a for loop. Also, each configuration behaves as if it were defined as a separate method (they actually are).
The code for unit tests is typically maintained by people who do not have full context on the library, so it has to be really simple and readable.

@worc3131 (Contributor, Author)

Hi @mthrok, thanks for looking at this. I'm happy to go down the other route; explicit is better than implicit, after all! I've written out the parameters for each test and deduplicated manually, which is a bit bulky, but such is life. Having looked at unittest.subTest again this morning, it would expand out to give you the individual test results that you want, except that it is not fully supported by pytest without a plugin, so it might be a nice tool for the future if not now. pytest-dev/pytest#1367

@mthrok (Collaborator) left a comment


Thank you for the update. The resulting code now looks much easier to understand.

def assert_compatibilities(self, n_fft, hop_length, power, n_mels, n_mfcc, sample_rate):

@staticmethod
def _set_up_sound():
    common_utils.set_audio_backend('default')
    path = common_utils.get_asset_path('sinewave.wav')
    sound, sample_rate = common_utils.load_wav(path)
Collaborator


Could you

  1. Instead of loading the data from a file, generate it?
    sample_rate = 16000
    sound = common_utils.get_sinusoid(n_channels=1, sample_rate=sample_rate)
  2. Instead of making a common method, put the above code in each test method?

These points were not included in the original issue description, but they are also important, and I would like to take this opportunity to address them.

The first point is to make the unit test not depend on other components. When our load function has an issue, this test could also fail, which makes it harder to see what is going on when looking at the log. A unit test should be simple and independent. The second point is an extension of that independence. Having a common method often makes new contributors think they have to use the same method for some reason they do not know. If a fixture is related to the core of the test logic, it is beneficial to make it common, but in this case it is not, so making each unit test independent of the others gives a better maintenance experience.
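Putting the two suggestions together, a minimal self-contained sketch could look like the following. The class name, parameter values, tolerance, and the librosa keyword arguments (htk/norm, chosen to roughly match torchaudio's default mel scale) are assumptions for illustration, not the code that landed in this PR:

import unittest

import librosa
import numpy as np
import torchaudio
from parameterized import parameterized, param

from torchaudio_unittest import common_utils


class TestMelSpectrogram(unittest.TestCase):
    @parameterized.expand([
        param(n_fft=400, hop_length=200, n_mels=64),
        param(n_fft=600, hop_length=100, n_mels=128),
    ])
    def test_melspectrogram(self, n_fft, hop_length, n_mels):
        # Generate the input inline rather than loading an asset from disk,
        # so a problem in the load function cannot make this test fail.
        sample_rate = 16000
        sound = common_utils.get_sinusoid(n_channels=1, sample_rate=sample_rate)

        result = torchaudio.transforms.MelSpectrogram(
            sample_rate=sample_rate, n_fft=n_fft,
            hop_length=hop_length, n_mels=n_mels)(sound)[0]
        # htk=True / norm=None approximately matches torchaudio's default mel filterbank.
        expected = librosa.feature.melspectrogram(
            y=sound[0].numpy(), sr=sample_rate, n_fft=n_fft,
            hop_length=hop_length, n_mels=n_mels, htk=True, norm=None)
        self.assertTrue(np.allclose(result.numpy(), expected, atol=5e-5))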

param(n_fft=200, hop_length=50, n_mels=128),
]
for norm in [None, 'slaney']
])
Collaborator


Use of the internal method _replace should be avoided.
We use the @parameterized.expand(list(itertools.product([...], [...]))) pattern for this kind of nesting.
Could you try the same?
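For reference, that pattern expands every combination into its own test method; a minimal, self-contained illustration (with made-up values, unrelated to this PR) is:

import itertools
import unittest

from parameterized import parameterized


class ExpandExample(unittest.TestCase):
    # Every (norm, (n_fft, hop_length)) combination becomes a separate test
    # method, so each one is discovered and reported individually.
    @parameterized.expand(list(itertools.product(
        [None, 'slaney'],
        [(400, 200), (600, 100)],
    )))
    def test_combinations(self, norm, fft_hop):
        n_fft, hop_length = fft_hop
        self.assertIn(norm, (None, 'slaney'))
        self.assertGreater(n_fft, hop_length)


if __name__ == '__main__':
    unittest.main()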

@worc3131 (Contributor, Author) commented on Feb 15, 2021

Despite having a leading underscore, I don't believe that _replace is an internal method. parameterized.param is a collections.namedtuple, and as per the Python docs the leading underscore here is to prevent naming conflicts, not to mark it as internal: "In addition to the methods inherited from tuples, named tuples support three additional methods and two attributes. To prevent conflicts with field names, the method and attribute names start with an underscore." That said, my fingers did shake when I typed it out, so I've replaced it with something else. I don't think that itertools.product would work well here, as it's more suited to trying every combination (the Cartesian product) of n_fft, hop_length, n_mels and mode.
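To illustrate the point about _replace being documented, public namedtuple API (a standalone example, unrelated to the PR's actual parameters):

from collections import namedtuple

# _replace returns a copy with the given fields overridden; the leading
# underscore only avoids collisions with field names, per the Python docs.
Params = namedtuple('Params', ['n_fft', 'hop_length', 'norm'])
base = Params(n_fft=400, hop_length=200, norm=None)
variant = base._replace(norm='slaney')
print(variant)  # Params(n_fft=400, hop_length=200, norm='slaney')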

Collaborator


Your update looks good to me. I was worried that the use of _replace, even if it's not an internal method, would force contributors to learn the internals of parameterized.param, which would raise the bar for writing tests and could lead to contributors not wanting to contribute at all. But the final form looks good, and even better than the itertools approach I ended up with. Thanks.

param(n_fft=600, hop_length=100, power=2.0, n_mels=128),
param(n_fft=400, hop_length=200, power=3.0, n_mels=128),
# NOTE: Test passes offline, but fails on TravisCI (and CircleCI), see #372.
param(n_fft=200, hop_length=50, power=2.0, n_mels=128, skip_ci=True),
Collaborator


Could you verify that both test_s2db and test_mfcc fail?
If one of them does not fail on CI, we would like to run that test rather than skip it.
From the previous implementation, I could not tell which part of the tests was failing.

Contributor Author


I removed the test_mfcc skip and it passed. The test_s2db skip fails, as shown here:

https://app.circleci.com/pipelines/github/pytorch/audio/5014/workflows/5c7c3a8d-1e26-4b07-83dd-ed4ce2d5c22f/jobs/162780

Collaborator


Thanks.

@mthrok (Collaborator) left a comment

Looks good. Thank you!


@mthrok mthrok merged commit d58ac21 into pytorch:master Feb 15, 2021
@worc3131 (Contributor, Author)

Perfect, thanks for all the feedback
