Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

whitenoise's scantree slows down test suite performance #558

Open
codingjoe opened this issue Jan 25, 2024 · 9 comments
Open

whitenoise's scantree slows down test suite performance #558

codingjoe opened this issue Jan 25, 2024 · 9 comments

Comments

@codingjoe
Copy link

codingjoe commented Jan 25, 2024

Python Version

3.12

Django Version

4.2

Package Version

5.3.0

Description

Hi there 馃憢 and happy 2024,

We are switching to service ESM, rather than bundling our dependencies. Mainly, because updating once dependency will invalidate the cache for a whole bundle.

While doing so, I noticed that our test suite drastically decreased in performance. The reason, whitenoise will scan the entire STATIC_ROOT directory on every single client request causing significant disk IO.

That's bearable on a MacBook Pro with 1聽GB/s disk speed and a large disk cache. On an Ubuntu worker on GitHub actions, we saw a 10x performance decrease. Taking our 8 min. test suite to a crisp 80 minutes.

I would propose moving the files dictionary from the middleware instance to the module or the thread.

The current implementation does not affect regular request service, since Middleware instances are not recycled. However, they are when you use a test client.

Until this is addressed, the only way I see to mitigate this behavior is to remove the middleware in your test suite:

# settings.py
TEST = os.getenv('TEST')
if TEST:
    # https://github.com/evansd/whitenoise/issues/558
    MIDDLEWARE.remove("whitenoise.middleware.WhiteNoiseMiddleware")

Love your thoughts!

Cheers, Joe

@merwok
Copy link

merwok commented Jan 25, 2024

In my experience, it鈥檚 usual to not add whitenoise to middleware for unit test settings.

@codingjoe
Copy link
Author

In my experience, it鈥檚 usual to not add whitenoise to middleware for unit test settings.

Interesting, in that case, I little documentation would go a long way here.

@adamchainz
Copy link
Collaborator

There鈥檚 an FAQ already on avoiding the full scan by enabling WHITENOISE_AUTOREFRESH: WhiteNoise makes my tests run slow!.

Any PRs to surface this information better are very welcome. But since this isn鈥檛 a new issue, I鈥檓 going to close it.

@adamchainz
Copy link
Collaborator

In my experience, it鈥檚 usual to not add whitenoise to middleware for unit test settings.

I would not recommend this, the more drift you introduce between test and production environments, the greater the risk for uncaught errors in that gap.

@merwok
Copy link

merwok commented Jan 25, 2024

That鈥檚 not how I read / apply that principle 馃檪

Unit tests are for testing my code, as fast as possible.

Test / prod environments are deployed servers. I validate waitress, whitenoise, dns, email sending, etc. thanks to this test environment, which is very close to prod.

@codingjoe
Copy link
Author

codingjoe commented Jan 25, 2024

Hi @adamchainz, thanks for the prompt response. I am a bit puzzled, though. The docs day the feature defaults to the DEBUG setting. Which is False in a Django test suite default and ours too. Soooo, things might not be as straightforward as they seem. As I understand the code, the auto-refresh will only bypass the instance cache. However, since there is a new instance per request, this isn't really making a dent in a test scenario. However, I only spent one day on profiling and debugging, you are the expert on this code base. Cheers! Joe

@adamchainz
Copy link
Collaborator

I鈥檓 not a huge expert, I just pattern-matched the problem to the one I鈥檝e seen before. New instances per test does sound like it will cause slowness, yes.

Yes, Django forces DEBUG to False during tests. The docs are trying to say that, by default, tests will have WHITENOISE_AUTOREFRESH set to False when it should really be True.

If you have any ideas for improvements, please do open a PR. Perhaps we should reopen this issue too...

codingjoe added a commit to codingjoe/whitenoise that referenced this issue Feb 11, 2024
@codingjoe
Copy link
Author

Hi @adamchainz,

I took a deep dive into the middleware life cycles in Django's test suite and pytest-django. I can confirm, each client will instantiate a new middleware. However, not on each call. See also:
https://github.com/django/django/blob/bc8471f0aac8f0c215b9471b594d159783bac19b/django/test/client.py#L169-L173

I opened a pull-request #562 with a test, that demonstrate the issue. The issue is resolved when the test doesn't fail.

So, yes, @adamchainz, I think it's best to reopen the issue. I will see if I can't find the time to cook up a patch.

Cheers!
Joe

@adamchainz adamchainz reopened this Feb 11, 2024
@adamchainz
Copy link
Collaborator

Great, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants