Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up LoadPathCache with more C #278

Closed
wants to merge 5 commits into from

Conversation

katafrakt
Copy link

@katafrakt katafrakt commented Oct 10, 2019

I decided to keep up the momentum with #277 and create a tenative pull request with my proposed changes to load path cache.

This includes:

  • Additional option to exclude some path to try them as potential load path to scan
  • New module in C to replace Dir.glob. It is both faster and easier on memory - unlike Dir.glob creating a large array and then iterating it, it yields with every found path and does path exclusion on C level.
  • Extending TravisCI matrix to test both with this new extension and without

I tested those changes against Discourse, but without significant speedup:

cold warm
no exclusions, Ruby 3.49 3.19
no exclusions, C 3.48 3.2
exclusions, Ruby 3.62 3.13
exclusions, C 3.48 3.15

However, with my internal project with a lot of javascript (webpacker etc.), caching etc. savings are significant:

cold warm
no exclusions, Ruby 9.45 0% 8.5 0%
no exclusions, C 9.02 4.5% 7.94 6.5%
exclusions, Ruby 8.49 10.1% 6.83 19.6%
exclusions, C 6.92 26.7% 6.86 19.2%

Legend:

  • cold - with load path cache purged (rm tmp/cache/bootsnap-load-path-cache)
  • warm - with load path cache present and no changes to the files

I'm not the best C programmer (or even mediocre one), so to use this new C extension one must set environment variable BOOTSNAP_EXPERIMENTAL. This way people with project in dire need to speed up can experimentally use this, while most of the people can safely use existing version.

Let me know what you think about this direction.

@katafrakt katafrakt requested a review from burke as a code owner October 10, 2019 13:17
@katafrakt
Copy link
Author

Rebased master to resolve conflict

@h0jeZvgoxFepBQ2C
Copy link

h0jeZvgoxFepBQ2C commented Sep 26, 2020

Is there any ETA when this gets merged? We would profit from this probably ❤️

@katafrakt
Copy link
Author

@h0jeZvgoxFepBQ2C this now conflicts with changes from #312. I think even with the other PR merged it would be still beneficial to have an option to exclude some dirs (node_modules, cache). I can extract it from this one and create another pull request if the maintainers are interested.

Pure C dir walker of course still would be faster than current master version, but the combination of changes in #312 with exclusion option might be enough to give significant speedup without the necessity to maintain scary chunk of C code.

@h0jeZvgoxFepBQ2C
Copy link

Is this closed since its completed? or whats the reasoning?

@casperisfine
Copy link
Contributor

The important part, which is skipping big useless directories like node_modules was implemented in #424.

As for directory scanning in C, I doesn't seem like it yields enough of a speedup to justify the increased complexity. Additionally the branch was very udated so the rebasing would have been tricky.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants