New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
activity.setStatus causes a massive performance hit during build (1s slower per 1.000 queries) #17452
Comments
Thanks for the detailed issue! This is known issue and is being tracked upstream in vadimdemedes/ink#212 Also in #15505 |
Hello thanks for those useful references and the upstream issue. My initial idea to turn off status reports entirely with an option would get rid of the Ink dependency and clean up the logs. Is this even an option as I don't find any signs of different log output levels in the sources nor do people request that feature. I found the issue with the number 3 from Kyle #3 which was open for quite some time but never implemented. Should we leave this issue open or close it as a duplicate of #15505 ? Thanks for taking your time, very much appreciated! |
…19866) * perf(gatsby-cli): throttle progress bar, build significantly faster this way This change explicitly throttles the progress bar updates (`tick` and `total`) to be clamped to 10 fps max. This drops the govbook benchmark from 210 seconds down to 140 seconds. Fixes #15505 Fixes #17452 Fixes #17966 Fixes #18801 Relates to #17873 Relates to vadimdemedes/ink#212 * Force flush on done() and use local var for total
Summary
activity.setStatus
which is used by gatsby during the build to report the page query and html page building progress introduces a performance hit in the dimension of 1s for every 1000 queries. It's like the status call costs exactly 1ms per call. This heavily builds up in environments with 10k's of queries.Demonstration:
https://github.com/georgiee/gatsby/tree/performance-test-activity/benchmarks/query
(run on my machine, 2,7 GHz Intel Core i7 13 macbook 2018)
Description
I observed a rather dramatic performance difference while using gatsby's activity tool in our internal source plugin where we process 10-20k pages at the moment. First we thought it's caused by the code block but soon we identified the status update as the cause.
This will take 8.5s on my machine during the build and is doing nothing. From this point I checked on the page query and static html output as those two things also rely on the status update.
Here the results for 10.000 pages/queries. The pages are almost empty, there are actually no page queries to run but still being processed.
.cache
is always cleared.1. With activity updates
here how it looks like (excerpt). You can see the status being updated. Each update costs around 1ms.
2. Without activity updates
(disabled in gatsby core, behind a new flag)
Here how it looks like (excerpt). No more status updates.
You can see how much faster the queries are running now. The page process is also a little bit faster but their update frequency is larger so the impact is smaller (0.5-1s faster, comapred to 8-9s faster queries).
This scales almost linear. 20.000 queries causes almost an overhead of 20 seconds.
Example/Reproduction
Here is the modified Gatsby Project. I used the benchmark project and synced it with
gatsby-dev-cli
.https://github.com/georgiee/gatsby/tree/performance-test-activity/benchmarks/query
Solution/Workarounds
Edit/Appendix
This has also an impact on
gatsby develop
(startup, refresh if enabled) of course as the output is the same for the page queries runner.The text was updated successfully, but these errors were encountered: