Performance Issue with Nokogiri #3142

Open
nirvdrum opened this issue Jun 29, 2023 · 0 comments

Collaborator

nirvdrum commented Jun 29, 2023

We have some tests in the test suite for an internal Rails app that take an inordinate amount of time in TruffleRuby. On my developer machine a single test takes ~50s. I don't have numbers from CI yet (still working on that), but the CI machines are far less capable so I wouldn't be surprised if they take at least a minute. It's quite noticeable.

Running the tests in isolation with the CPU profiler shows that most of the time is spent in Nokogiri. I've tried to narrow it down to a simpler reproduction. Unfortunately, my simplest examples didn't show the same performance issues, so I've left in a few dependencies:

  • bootstrap-email
    • Styles HTML emails with the Bootstrap UI toolkit
  • premailer
    • Inlines CSS rules into an HTML email body to improve client compatibility (many email clients won't load in linked stylesheets)
  • ActionMailer
    • The Rails component responsible for rendering emails. It's possible to construct an email body without this, but I found performance was worse when ActionMailer was used

I've pulled together a repo with a representative example. It's not exactly what we use in the app; in particular, the email bodies are different. But I think it captures the key details, can be open sourced for public discussion, and can be used for test cases. The performance in this repo isn't quite as bad as what I'm seeing with the app test suite, and I need to investigate more to see why. While the reproduction isn't an exact reflection of reality, I'm hopeful it shows enough to get us going. Processing a single document shouldn't take 1s.

A confounding issue is that the bootstrap-email gem compiles the Bootstrap CSS files with sassc at start-up, which takes several seconds to complete. In CI, we end up compiling on every run because temporary files are not preserved across test runs. Locally, the cache will be populated and reused unless you explicitly remove it (rm -rf tmp/cache). It's best to keep the cache if you're profiling the Nokogiri usage, but removing it to start from a clean state is more representative of what CI is doing.
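For anyone switching between the two modes, here is a minimal sketch of a helper that clears the cache before a run. The `reset_cache` method and the `CLEAN_CACHE` environment variable are hypothetical names made up for this illustration; they are not part of the reproduction repo or of bootstrap-email. The only path assumed is the `tmp/cache` directory mentioned above.

```ruby
require "fileutils"

# Removes the cache directory so the next run starts cold, as CI does.
# Returns true if a cache was present and removed, false otherwise.
# (reset_cache is a hypothetical helper, not part of any gem.)
def reset_cache(dir = "tmp/cache")
  present = Dir.exist?(dir)
  FileUtils.rm_rf(dir)
  present
end

# CLEAN_CACHE is a hypothetical environment variable used only in this sketch.
if ENV["CLEAN_CACHE"] == "1"
  if reset_cache
    puts "Removed tmp/cache; the next run compiles the CSS again"
  else
    puts "No cache found; the run is already cold"
  end
end
```

Leaving `CLEAN_CACHE` unset keeps the warm cache, which suits profiling the Nokogiri usage; setting it to `1` approximates a CI run.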

With TruffleRuby 23.0.0 we swapped the underlying VM for one that's more performant in various ways. That creates an awkward situation: the release build handles some performance problems for us that dev builds do not, but we can't build TruffleRuby with that VM because it's not open source. So, I'm providing numbers here from both:

TruffleRuby 23.0.0 + Oracle GraalVM

================================================================================
Email: 0
Took: 4.367758917003812s
================================================================================

================================================================================
Email: 1
Took: 0.8536989160056692s
================================================================================

================================================================================
Email: 2
Took: 0.9239389580034185s
================================================================================

================================================================================
Email: 3
Took: 0.49181870800384786s
================================================================================

TruffleRuby 23.1.0-dev (0d3058b) + GraalVM CE

================================================================================
Email: 0
Took: 10.682570666001993s
================================================================================

================================================================================
Email: 1
Took: 4.075295624999853s
================================================================================

================================================================================
Email: 2
Took: 2.1472915840058704s
================================================================================

================================================================================
Email: 3
Took: 2.8145212919989717s
================================================================================
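
For reference, output in this shape can be produced with a small timing loop like the sketch below. This is an illustration only, not the code from the reproduction repo: `time_emails` is a hypothetical helper, and the block passed to it stands in for the real ActionMailer + bootstrap-email + premailer work.

```ruby
require "stringio"

# Times `count` iterations of the given block with a monotonic clock and
# prints one "Email / Took" record per iteration.
# Returns the elapsed times in seconds. (time_emails is a hypothetical helper.)
def time_emails(count, out: $stdout)
  separator = "=" * 80
  Array.new(count) do |index|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield index # placeholder for rendering and inlining one email
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

    out.puts separator
    out.puts "Email: #{index}"
    out.puts "Took: #{elapsed}s"
    out.puts separator
    out.puts
    elapsed
  end
end

# Usage with a trivial stand-in workload; swap in the real email processing.
time_emails(4) { |index| "<p>email #{index}</p>" * 1_000 }
```

`Process::CLOCK_MONOTONIC` is used rather than `Time.now` so that wall-clock adjustments don't skew the measurements.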