Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression while requiring lots of files #84

Closed
redox opened this issue Aug 4, 2015 · 37 comments
Closed

Performance regression while requiring lots of files #84

redox opened this issue Aug 4, 2015 · 37 comments
Milestone

Comments

@redox
Copy link

redox commented Aug 4, 2015

Hi guys,

I'm following rails/sprockets-rails#225 and I would like to share with you a performance regression I'm experiencing after moving from sprocket-rails 2.1.3 to 3.0 or master.

It turns out that the issue is not in sprocket-rails but rather in the underlying sprockets since most of the time is spent resolving asset paths (cc @eileencodes).

Here is what I'm seeing on my personal app (ruby-prof output):

Measure Mode: wall_time
Thread ID: 70253745167340
Fiber ID: 70253756137300
Total: 1.518107
Sort by: self_time

 %self      total      self      wait     child     calls  name
 19.84      0.302     0.301     0.000     0.000     1368   <Module::Marshal>#load_without_autoloading
  5.71      0.087     0.087     0.000     0.000      143   Zlib::Inflate#inflate
  4.60      0.070     0.070     0.000     0.000     1339   <Class::File>#utime
  4.22      0.074     0.064     0.000     0.010     1908   Sprockets::URIUtils#join_file_uri
  3.72      0.448     0.056     0.000     0.392     1339   Sprockets::EncodingUtils#unmarshaled_deflated
  3.58      0.054     0.054     0.000     0.000    21137   String#scan
  2.48      0.222     0.038     0.000     0.184    20937   Sprockets::Resolve#parse_path_extnames
  2.16      0.033     0.033     0.000     0.000     4095   <Class::File>#exist?
  2.03      0.031     0.031     0.000     0.000        2   SQLite3::Statement#step
  1.77      0.055     0.027     0.000     0.029    20939   Array#reverse_each
  1.75      0.048     0.027     0.000     0.021      302   Array#include?
  1.74      0.028     0.026     0.000     0.002     7471  *String#gsub
  1.63      0.025     0.025     0.000     0.000     1370   File#initialize
  1.46      0.022     0.022     0.000     0.000   120894   String#==
  1.42      0.052     0.022     0.000     0.031     1831   Sprockets::DigestUtils#digest
  1.38      0.021     0.021     0.000     0.000    21201   <Class::File>#basename
  1.32      0.020     0.020     0.000     0.000       20   <Class::Dir>#[]
  1.29      0.020     0.020     0.000     0.000     3203   Regexp#===
  1.28      0.161     0.019     0.000     0.141    20937   Sprockets::PathUtils#match_path_extname

screen shot 2015-08-01 at 13 41 11

And here is a small project reproducing the issue: redox/perf_regression_sprockets. In that application a page that took 20ms to be rendered using sprockets-rails 2.1.3 now takes 60ms with the latest master branch.

Any chance it could be related to the "compat" resolving layer? Let me know if I can help!

@rafaelfranca rafaelfranca added this to the 3.x milestone Aug 13, 2015
@mib32
Copy link

mib32 commented Sep 7, 2015

I've got exactly same problem.

@mib32
Copy link

mib32 commented Sep 7, 2015

Here is my vanilla rails app, with 100 .coffee files. It renders at average of 900ms in development environment. Is it supposed to be that long?
https://github.com/mib32/slow-sprockets-example

@jorisvh
Copy link

jorisvh commented Nov 18, 2015

Exactly the same issue, upgrading from sprockets 2.12 (70ms) to 3.4 (900ms).

@schneems
Copy link
Member

I finally got a spare moment to try and repro this. I used the example app from @mib32. I ran with sprockets 2 and then with sprockets 3. The results will SHOCK you (read in news reporter voice).

Sprockets 2

$ bundle exec rake assets:clobber; time env RAILS_ENV=production bundle exec rake assets:precompile
I, [2015-11-19T09:42:01.013358 #4944]  INFO -- : Removed /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets
rm -rf /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/tmp/cache/assets
I, [2015-11-19T09:42:18.361928 #4946]  INFO -- : Writing /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets/application-91938166bf4ab99d4165669cd7fb7fbb.js
I, [2015-11-19T09:42:18.427565 #4946]  INFO -- : Writing /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets/application-7fa3632c92a28b60e30f9b6b6bf8faf8.css

real    0m17.444s
user    0m13.438s
sys    0m2.944s

Sprockets 3

$ bundle exec rake assets:clobber; time env RAILS_ENV=production bundle exec rake assets:precompile
I, [2015-11-19T09:43:06.891215 #5245]  INFO -- : Removed /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets
rm -rf /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/tmp/cache/assets
I, [2015-11-19T09:43:12.434402 #5248]  INFO -- : Writing /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets/application-d879645f2bbe4579d56062dc22ac18b86d5f5e10759eedfada4789a09614cfe4.js
I, [2015-11-19T09:43:12.492377 #5248]  INFO -- : Writing /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets/application-0723cb9a2dd5a514d954f70e0fe0b89f6f9f1ae3a375c182f43b5f2b57e9c869.css

real    0m5.623s
user    0m4.654s
sys    0m0.768s

So sprockets 3 is about 3x faster than sprockets 2 here with no cache. That's the opposite of what's reported. I'll give the code provided by @redox a shot.

@schneems
Copy link
Member

I'm seeing the same thing with the other code from @redox sprockets 3 is faster than 2 (but not by quite the same difference). How did you instrument your app with ruby-prof?

@mib32
Copy link

mib32 commented Nov 19, 2015

@schneems Hello Richard, thank you for taking your time on that thing.

I see that you are measuring time that rake assets:precompile takes. But i personally report about regression of Page load time.

Precompilation and page rendering are not anyhow connected, right? Please correct me if I am getting something wrong.

@eileencodes
Copy link
Member

We might be seeing this - noticed that on first load changing any file takes forever to load. Haven't had a chance to look into it yet.

@schneems
Copy link
Member

The same machinery that powers assets:precompile is running behind the scenes of the middleware. We still have to compile your asset, we just do it on the fly. I was worried we had a regression for the whole thing. Maybe the regression is somewhere in the middleware that serves the request.

I'm less familiar with profiling in a live request and hoping I could repro in a shell script. Tips on debugging would be nice. Like how you got that ruby-prof graph?

@mib32
Copy link

mib32 commented Nov 19, 2015

@schneems The graph in the top is not mine, but i had the same results, when i profiled literally this way in application.html.haml:

- RubyProf.start
= javascript_include_tag ...
= stylesheet_include_tag ...
- RubyProf.stop

When i went through a profile, i saw that bottleneck is in the part of code that figures out the filename of a n asset and gives it to renderer. But i couldn't investigate further due to lack of ruby knowledge.

I would like to point out again that it is only in development.

@mib32
Copy link

mib32 commented Nov 19, 2015

@eileencodes Yep. Especially if you use @imports. But i think that this is another issue.

@daniel-rikowski
Copy link
Contributor

Could you perhaps share your Ruby and OS version numbers?

I might be possible certain Ruby versions on certain platforms in combination with a large number of gems (i.e. a large load path) result in such a performance regression. (For reference: https://bugs.ruby-lang.org/issues/10015)

@jorisvh
Copy link

jorisvh commented Nov 20, 2015

ruby 2.2.1p85 (2015-02-26 revision 49769) [x86_64-darwin14]
on OS X El Capitan (10.11.1)

@daniel-rikowski
Copy link
Contributor

@jorisvh If I understand correctly you should be affected by that Ruby problem, but it still would not account for that drastic difference between 70ms and 900ms.

Are you by any chance comparing debug and non-debug mode? Because that way I get similar numbers.

This is what I measured at my workstation (Ruby 2.2.3, Windows 10) using https://github.com/mib32/slow-sprockets-example

mode sprockets 2.12.4 sprockets 3.3.4
debug enabled 500 ms 1100ms
debug disabled 50 ms 80 ms
initial compilation 58 s 8 s (!)

@daniel-rikowski
Copy link
Contributor

@mib32 Are you getting about 70 ms with sprockets 2 in debug mode, too? That's where I see render times of 500 ms... 🐌 (Might be a Windows issue, though)

Just to be sure: I'm talking about asset debug mode, i.e. config.assets.debug, not config.cache_classes or config.eager_load.

@mib32
Copy link

mib32 commented Nov 20, 2015

@daniel-rikowski Yeah... You seems to be right, i am unable to see any significant regression no more...

@seanlinsley
Copy link

Here's a flamegraph visualization from my app: (rawgit, gist)

screen shot 2015-11-20 at 12 17 27 pm

The dark blue section taking up 60% of the ~500 ms page load time is Sprockets, despite the fact that multiple requests have already happened, so Sprockets should have all the info it needs to serve the request.

This part of the app is inside Active Admin, which has 11 CoffeeScript files and 41 SCSS files. Our customizations on top of that add 6 CoffeeScript files and 1 SCSS file.

These are the asset settings I have configured for development:

  config.assets.debug = true

  config.assets.digest = true

  config.assets.raise_runtime_errors = true

As expected I'm not seeing this slowdown in production, because there the files are already precompiled.

@seanlinsley
Copy link

I can reproduce with the test app that Active Admin generates, as well. I can give steps to set it up if needed.

@mib32
Copy link

mib32 commented Nov 20, 2015

@seanlinsley Could you please answer couple of questions:

  • What version of sprockets are you using?
  • Do you see that slowdown in version 2.14? Could you present graphs for 2.14?

@tyler-vincent
Copy link

We're seeing this slowdown as well on a large app that was just upgraded from Rails 3.2 to 4.2 with config.assets.debug = true. Oddly the CSS compilation doesn't seem to affect performance much because we're using SASS and @import statements to create one application.css file.

The JS file calls murder the speed though. It seems like each 304 return from the server is taking a really long time which adds up to a long page load.

For now we've sped things up a bit by using our own modified version of this:
https://github.com/discourse/discourse/blob/master/lib/middleware/turbo_dev.rb
The solution feels real sketchy though.

@redox
Copy link
Author

redox commented Jan 17, 2016

@mib32 did you manage to reproduce something?

@danielpclark
Copy link

danielpclark commented Jun 4, 2016

I did a big sprockets upgrade and it didn't break anything. I'm very happy about this. 😄

Dev environment details: I'm experiencing a page loading performance drop after upgrading from sprockets 2.8 to 3.6 (after reading the Speeding up Sprockets blog post). There were a few dependencies linked into it so I ended up upgrading much more than just sprockets.

Notably where I was specifically looking for improvement is with my layout/application template, as it was the slowest resource.

Resource Before upgrade After upgrade
layout/application 9ms 48ms
JavaScript 160ms 180ms
CSS 70ms 100ms

The only other notable changes I did for the assets with the upgrade is change the file extensions from .js.coffee to .coffee and .css.scss to .scss. I have not created any manifests to help with the assets (after reading the documentation I haven't grasped the usefulness of this or the howto of it). Perhaps the manifest is my missing piece for performance?

I've noticed my flame graph showed more time being spent in sprockets as well.

I realize that this may be because of some side effect from the other gems. I'll do some more testing to determine this. And I'll get back to you.

@danielpclark
Copy link

danielpclark commented Jun 9, 2016

Alright I did some digging and here's my derailed stackprof of slow methods

before

Booting: development
Endpoint: "/"
       user     system      total        real
100 requests 26.830000   1.780000  28.610000 ( 28.866952)
Running `stackprof tmp/2016-06-09T00:42:10-04:00-stackprof-cpu-myapp.dump`. Execute `stackprof --help` for more info
==================================
  Mode: cpu(1000)
  Samples: 7184 (0.03% miss rate)
  GC: 1013 (14.10%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
      1894  (26.4%)        1894  (26.4%)     Pathname#chop_basename
      1104  (15.4%)         937  (13.0%)     Hike::Index#entries
      1314  (18.3%)         427   (5.9%)     BetterErrors::ExceptionExtension#set_backtrace
       391   (5.4%)         386   (5.4%)     Sprockets::Mime#mime_types
      1466  (20.4%)         305   (4.2%)     Pathname#plus
      1384  (19.3%)         272   (3.8%)     BindingOfCaller::BindingExtensions#callers
       402   (5.6%)         205   (2.9%)     Hike::Index#find_aliases_for
       373   (5.2%)         175   (2.4%)     Hike::Index#sort_matches
      1628  (22.7%)         162   (2.3%)     Pathname#+
       234   (3.3%)         117   (1.6%)     ActionView::PathResolver#find_template_paths
       126   (1.8%)         101   (1.4%)     Hike::Index#build_pattern_for
       206   (2.9%)          98   (1.4%)     Hike::Index#pattern_for
      1696  (23.6%)          93   (1.3%)     Hike::Index#match
        65   (0.9%)          65   (0.9%)     Hike::Trail#stat
       118   (1.6%)          63   (0.9%)     Hike::Index#initialize
      2454  (34.2%)          62   (0.9%)     Pathname#join
        59   (0.8%)          59   (0.8%)     Time#compare_with_coercion
        57   (0.8%)          52   (0.7%)     ActiveSupport::FileUpdateChecker#watched
      5345  (74.4%)          52   (0.7%)     Sprockets::Rails::Helper#check_errors_for
       760  (10.6%)          47   (0.7%)     Pathname#relative?
        45   (0.6%)          30   (0.4%)     Sprockets::Engines#deep_copy_hash
       131   (1.8%)          25   (0.3%)     ActiveSupport::FileUpdateChecker#max_mtime
        88   (1.2%)          21   (0.3%)     Sprockets::Asset#dependency_fresh?
        39   (0.5%)          18   (0.3%)     ActiveSupport::Inflector#camelize
        18   (0.3%)          18   (0.3%)     String#blank?
        18   (0.3%)          18   (0.3%)     ActionView::Helpers::AssetUrlHelper#compute_asset_extname
        17   (0.2%)          17   (0.2%)     ActiveSupport::SafeBuffer#initialize
        82   (1.1%)          16   (0.2%)     ActiveRecord::Migrator.migrations
       108   (1.5%)          14   (0.2%)     ActionView::Helpers::AssetUrlHelper#asset_path
        14   (0.2%)          14   (0.2%)     ThreadSafe::NonConcurrentCacheBackend#[]

after

Booting: development
Endpoint: "/"
       user     system      total        real
100 requests 34.960000   2.740000  37.700000 ( 37.947141)
Running `stackprof tmp/2016-06-09T00:05:11-04:00-stackprof-cpu-myapp.dump`. Execute `stackprof --help` for more info
==================================
  Mode: cpu(1000)
  Samples: 9465 (0.00% miss rate)
  GC: 1432 (15.13%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
      1261  (13.3%)        1003  (10.6%)     Sprockets::PathUtils#entries
       702   (7.4%)         702   (7.4%)     URI::RFC3986_Parser#split
       611   (6.5%)         611   (6.5%)     Sprockets::PathUtils#path_extnames
       600   (6.3%)         600   (6.3%)     Sprockets::PathUtils#stat
       669   (7.1%)         399   (4.2%)     Sprockets::URITar#initialize
       341   (3.6%)         341   (3.6%)     URI::RFC2396_Parser#escape
      1164  (12.3%)         283   (3.0%)     Sprockets::PathUtils#match_path_extname
       270   (2.9%)         270   (2.9%)     Sprockets::Paths#root
      1419  (15.0%)         235   (2.5%)     Set#initialize
       225   (2.4%)         225   (2.4%)     Set#add
       187   (2.0%)         187   (2.0%)     URI::RFC2396_Parser#unescape
       184   (1.9%)         167   (1.8%)     Sprockets::DigestUtils#digest
       151   (1.6%)         151   (1.6%)     #<Module:0x0000000198d748>.load_with_autoloading
       453   (4.8%)         148   (1.6%)     Set#merge
       950  (10.0%)         147   (1.6%)     Sprockets::URIUtils#split_file_uri
      1163  (12.3%)         140   (1.5%)     Sprockets::Resolve#parse_path_extnames
      7286  (77.0%)         118   (1.2%)     Sprockets::CachedEnvironment#initialize
       116   (1.2%)         116   (1.2%)     Sprockets::PathUtils#split_subpath
       439   (4.6%)         114   (1.2%)     Sprockets::URIUtils#join_file_uri
       218   (2.3%)         109   (1.2%)     ActionView::PathResolver#find_template_paths
       153   (1.6%)         104   (1.1%)     Sprockets::HTTPUtils#find_q_matches
       129   (1.4%)         100   (1.1%)     Sprockets::Mime#extname_map
       188   (2.0%)          98   (1.0%)     Sprockets::URITar#expand
      2024  (21.4%)          95   (1.0%)     Sprockets::Dependencies#resolve_dependency
        89   (0.9%)          89   (0.9%)     Sprockets::PathUtils#absolute_path?
       719   (7.6%)          59   (0.6%)     Sprockets::CachedEnvironment#stat
        68   (0.7%)          57   (0.6%)     ActiveSupport::FileUpdateChecker#watched
        52   (0.5%)          52   (0.5%)     Time#compare_with_coercion
       195   (2.1%)          44   (0.5%)     Sprockets::EncodingUtils#unmarshaled_deflated
        39   (0.4%)          39   (0.4%)     block (2 levels) in <class:Numeric>

So my app is faster with the older sprockets: (from flame graph info of page load)

before

sprockets-rails-2.3.3 (431 samples - 57.09%)
sprockets-2.12.4 (417 samples - 55.23%)

after

sprockets-rails-2.3.3 (610 samples - 63.34%)
sprockets-3.6.0 (601 samples - 62.41%)

It looks like way more sprocket method calls are being done and the app is spending more time in it.

@schneems
Copy link
Member

This is a really long thread. I'm working on performance again right now. Did anyone include any slow example apps on this thread?

@redox
Copy link
Author

redox commented Jun 20, 2016

This is a really long thread. I'm working on performance again right now. Did anyone include any slow example apps on this thread?

You can use https://github.com/redox/perf_regression_sprockets -> I tried to make it pretty straight forward. Let us know how that goes @schneems 👍

@nfm
Copy link

nfm commented Jun 28, 2016

I've just run into this issue too on a newly added, asset heavy page on our site.

Here's my current reproduction: https://github.com/nfm/sprockets-3.x-performance-regressions

The meat of it is this super simple rake task:

namespace :sprockets do
  task benchmark: :environment do
    asset_count = 1500
    puts "Creating #{asset_count} empty assets..."
    asset_count.times { |i| `touch #{Rails.root}/app/assets/images/#{i}.png` }
    puts "Done"

    puts "Precompiling assets..."
    Rake.application.invoke_task "assets:precompile"
    puts "Done"

    puts "Running benchmark..."

    start = Time.now
    count = 1500
    count.times { ActionController::Base.helpers.asset_path('1.png') }
    puts "#{count} lookups of the same asset took #{(Time.now - start) / 1.second}"

    start = Time.now
    count = 1500
    count.times { |i| ActionController::Base.helpers.asset_path("#{i}.png") }
    puts "#{count} lookups of #{count} different assets took #{(Time.now - start) / 1.second}"
  end
end

Otherwise it's just a newly initialized Rails 4.2.6 app, with two branches: one for Sprockets 2.x and one for Sprockets 3.x.

For me, on Sprockets 3.x branch:

1500 lookups of the same asset took 6.038463876
1500 lookups of 1500 different assets took 6.059824304

And on Sprockets 2.x branch:

1500 lookups of the same asset took 0.069654396
1500 lookups of 1500 different assets took 0.064596837

It seems to be related to the number of assets that Sprockets is managing - nothing to do with their content, or whether lots of them are looked up or not. I would have expect to see O(1) lookup times for any number of assets in the repo, but for some reason 3.x is two orders of magnitude slower than 2.x.

@schneems
Copy link
Member

@redox Not to be too over zealous but I think I already fixed the performance regression in Sprockets 3. When I use Sprockets 3.2.0 with your app I see a page load take about 400ms. (using rack-mini-profiler) When I use Sprockets 3.6.2 it takes about 100ms. Can you try upgrading and let me know how it looks.

@nfm i haven't looked at your app yet, it's already using Sprockets 3.6.2. I'll need to dig in more.

@tyler-vincent
Copy link

tyler-vincent commented Jun 28, 2016

We noticed a large performance improvement upgrading to sprockets-rails 3.x from 2.x. It's not perfect but way better than before. We did also go to sprockets 3.6 from 3.5.2 but I believe it was sprockets-rails that fixed the issue

schneems added a commit that referenced this issue Jun 28, 2016
The method `match_path_extname` is expensive. We know that if the name of the file on disk does not start with the same basename that we're looking for then it won't matter what extensions it's got. This check is pretty fast.

On the example app

https://github.com/nfm/sprockets-3.x-performance-regressions

It reduced asset lookup from 15 seconds to 2.6 seconds. It's still not perfect but it's a 400% + speed increase in the case of looking for an asset in a huge directory.

Related comment #84 (comment)
schneems added a commit that referenced this issue Jun 28, 2016
The method `match_path_extname` is expensive. We know that if the name of the file on disk does not start with the same basename that we're looking for then it won't matter what extensions it's got. This check is pretty fast.

On the example app

https://github.com/nfm/sprockets-3.x-performance-regressions

It reduced asset lookup from 15 seconds to 2.6 seconds. It's still not perfect but it's a 400% + speed increase in the case of looking for an asset in a huge directory.

Related comment #84 (comment)
@schneems
Copy link
Member

@nfm I found one place where it was spending a bunch of time, thanks for the example app. It's not back to 2.x speeds but I was able to get a 400% perf improvement with #331

There is some logic that was added to 3.x that will allow for multiple assets to be matched. It's expensive in large directories. I don't know why it was added and I don't feel comfortable tearing it out without more context. So after that patch gets green and is merged to master I can backport to 3.x and cut a release. It is still slower than 2.x but at least it's less slow.

@schneems
Copy link
Member

Turns out it's perhaps faster, but not by 400% made a mistake while benchmarking.

schneems added a commit that referenced this issue Jun 28, 2016
The method `match_path_extname` is expensive. We know that if the name of the file on disk does not start with the same basename that we're looking for then it won't matter what extensions it's got. This check is pretty fast.

On the example app

https://github.com/nfm/sprockets-3.x-performance-regressions

It reduced asset lookup from 15 seconds to 2.6 seconds. It's still not perfect but it's a 400% + speed increase in the case of looking for an asset in a huge directory.

Related comment #84 (comment)
@nfm
Copy link

nfm commented Jun 29, 2016

@schneems thanks for taking a look :)

Interesting thought about things being slow if there are many assets in one directory. I've updated my repo to add another benchmark where each asset is in its own directory. Probably the most realistic use-case is having a few directories with a reasonable number of files (rather than one directory with many files, or many directories with one file), but I think these two benchmarks illustrate the problem.

Here's my latest benchmarks:

Sprockets 2.x:

# ./bin/rake sprockets:benchmark
1500 lookups of the same asset took 0.086173673
1500 lookups of 1500 different assets took 7.336354235
# ./bin/rake sprockets:benchmark_dirs
1500 lookups of the same asset took 0.043988341
1500 lookups of 1500 different assets took 1.415721443
# ./bin/rake sprockets:benchmark PRECOMPILE=1
1500 lookups of the same asset took 0.109657337
1500 lookups of 1500 different assets took 0.10572148
# ./bin/rake sprockets:benchmark_dirs PRECOMPILE=1
1500 lookups of the same asset took 0.094545266
1500 lookups of 1500 different assets took 0.087344999

Sprockets 3.x:

# ./bin/rake sprockets:benchmark
1500 lookups of the same asset took 7.827039138
1500 lookups of 1500 different assets took 6.186335159
# ./bin/rake sprockets:benchmark_dirs
1500 lookups of the same asset took 1.993022409
1500 lookups of 1500 different assets took 0.587485463
# ./bin/rake sprockets:benchmark PRECOMPILE=1
1500 lookups of the same asset took 6.232709039
1500 lookups of 1500 different assets took 6.381696028
# ./bin/rake sprockets:benchmark_dirs PRECOMPILE=1
1500 lookups of the same asset took 0.729864224
1500 lookups of 1500 different assets took 0.631234396

So it does look like the directory structure makes a significant difference without precompiling, which intuitively feels right. But the interesting discrepancy is the difference the directory structure makes after precompiling in sprockets-2.x vs sprockets-3.x. The performance should be the same for both directory structures, and it is in sprockets-2.x, but in sprockets-3.x many files in one directory is much slower. Something weird is going on!

@nfm
Copy link

nfm commented Jun 29, 2016

Ok, so it looks like most of the weirdness is me misunderstanding Sprockets!

Precompiling the assets is not enough to have the benchmark behave like Sprockets will in production. The benchmark was running in the development environment where config.assets.debug is set to true, and as a result assets were being looked up by Sprockets::Rails::HelperAssetResolvers::Environment instead of Sprockets::Rails::HelperAssetResolvers::Manifest - the precompiled assets seem to be ignored.

Our issue in production turned out to be that config.assets.compile was accidentally set to true. This doesn't default to false in production; you do need to explicitly set it to false in config/environments/production.rb (Rails generates this config for you, but we had inadvertently removed it thinking it was redundant).

Our manifest was being used to look up assets, but some asset paths were being provided without file extensions (eg. image_path('some-image') instead of image_path('some-image.png')), which was causing them to not be found in the manifest, and they were falling back to being compiled at request time.

If you're having performance issues, my take-away is to check Rails.application.config.assets in production, and confirm the following settings:

  • compile should be set to false. This isn't done by default - it has to be specified in your config/environments/production.rb.
  • debug should be set to false. This is done by default, so it will only be true if you've added it to your config/environments/production.rb, which is pretty unlikely.
  • digest should be set to true. This is done by default, so it will only be false if you've added it to your config/environments/production.rb, which is pretty unlikely.
  • resolve_with should be set to [:manifest]. If compile, debug, and digest are configured correctly, sprockets-rails will configure this correctly for you.

@schneems
Copy link
Member

schneems commented Jul 1, 2016

@MFM yes, keeping compile=true is a big hit on performance. I added a note to an issue #327 (comment) that could have prevented your case from happening, if we raised when you're looking up an asset without an extension. I was able to successfully use your test script to make quite a few improvements in master and one on the 3.x branch. Thank you very much for taking the time to write the script and to share it.

On the script: i noticed the manifest behavior of your script with sprockets 2.x and removed it in my local tests (basically if you have a manifest file sprockets 2.x will use that skipping the asset lookup logic which is why it was so fast).Your script did help me find a performance regression and ultimate fix I have a PR #336 that will drastically improve performance.

That being said there is one core difference in behavior between 3.x and 2.x. Sprockets 2.x caches individual asset lookups so when you do the second lookup you hit a cache. Sprockets 3.x does not do this but rather it caches directory entries instead. I don't know why that decision was made. It is true that even with the #336 patch applied that it is still not as fast as 2.x in repeatedly looking up the same asset, however in the second "random" asset lookup test it is actually faster than 2.x. I want to play with the format that we use to store the entries, as a hash lookup could prove to be much faster than looping through all the entries. Until I get the time to do that you should see improved speed in development with #336. When tests are green I plan on merging and cutting a release.

@schneems
Copy link
Member

schneems commented Jul 1, 2016

I cut a release 3.6.3 please try it out. Going to close this thread for now. Thank you all for providing feedback and example scripts.

@schneems schneems closed this as completed Jul 1, 2016
@nfm
Copy link

nfm commented Jul 3, 2016

Thanks @schneems, that all makes sense. Nice work with the performance fix! Glad to be able to help out :)

@PikachuEXE
Copy link

I just reran rake sprockets:benchmark 3 times per branch
with latest sprockets (see versions in headings below)

sprockets 2.12.4

1500 lookups of the same asset took 0.238217
1500 lookups of 1500 different assets took 9.4763

1500 lookups of the same asset took 0.254395
1500 lookups of 1500 different assets took 9.344158

1500 lookups of the same asset took 0.25907
1500 lookups of 1500 different assets took 9.132838

sprockets 3.7.0

1500 lookups of the same asset took 9.655606
1500 lookups of 1500 different assets took 2.507665

1500 lookups of the same asset took 5.900586
1500 lookups of 1500 different assets took 3.093761

1500 lookups of the same asset took 6.441615
1500 lookups of 1500 different assets took 2.9105

I hope "lookups of the same asset" can take less time :(

@schneems
Copy link
Member

schneems commented Nov 1, 2016

So here's the deal. Sprockets 3 is MUCH faster at looking up an individual non-cached asset because of the way caching changed between 2 and 3. In Sprockets 2 the individual assets are cached, which is not all that useful unless you're using the same asset a ton. In that case it is much much faster in sprockets 3. However if you need different assets, it is really slow. In Sprockets 3 we cache entries in directories instead of the individual assets. This is faster for multi asset lookups. Basically a non-cached asset lookup on Sprockets 3 is faster than on Sprockets 2.

If we wanted that speed back for the same asset we would have to add another layer of caching back to Sprockets 3 which would actually decrease the performance of looking up a non-duplicate asset call as we're having to add a cache check that will miss frequently. Invalidating the individual asset is also a concern.

I'm not opposed to making the same asset lookup time better, but we can't do it at the expense of a non-duplicate asset lookup.

@pierre-pretorius
Copy link

We have about 300 CSS & JS files in our application. Our html head is contained in a partial. In development mode this partial renders in 60ms with sprockets 2. With sprockets 3 it takes 100ms with debug: false and 700ms with debug: true. With debug I'm referring to asset debugging.

Since you can't use sprockets 2 with Rails 5 it seems our current options are either to use debug: false or move over to the webpacker gem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests