
Watch mode performance issue in seal process #8126

Closed
MLoughry opened this issue Oct 2, 2018 · 19 comments
@MLoughry
Contributor

MLoughry commented Oct 2, 2018

Feature request

What is the expected behavior?
Watch mode rebuilds would re-use, as much as feasible, the chunk graph from the previous rebuild, to optimize build speed.

What is the motivation or use case for adding/changing the behavior?
Our project has >12k modules, and using the latest webpack (without the monkey-patching alluded to in #7546) results in 75-90s rebuild times. Earlier 4.x builds would rebuild in 15-20s.

When profiling a run where the only change made to trigger a rebuild is adding a newline to a code file, 79% of the time is spent in Compilation.seal(), and another 11% in garbage collection. Compilation.processDependenciesBlocksForChunkGroups() alone accounts for 76% of the rebuild time (including GC).

How should this be implemented in your opinion?
I have no idea; I'd need guidance from @sokra or someone else more familiar with this area. I looked at simply optimizing processDependenciesBlocksForChunkGroups, but couldn't get a good enough understanding of the code to make any meaningful change. I'm also not sure how a rebuild Compilation re-uses data from a previous compilation, so I had no idea how to even begin investigating that route.

Are you willing to work on this yourself?
Yes; I've been given a couple of dev weeks to work on our team's build performance. However, I don't think I can make any meaningful progress without some guidance.

@MLoughry
Contributor Author

MLoughry commented Oct 2, 2018

I've hosted the CPU profiles for the above stats @ https://github.com/MLoughry/webpack-8126-profiles

@sokra
Member

sokra commented Oct 3, 2018

[screenshot of the uploaded CPU profile]

Could you remove the breakpoints while profiling?

@MLoughry
Contributor Author

MLoughry commented Oct 3, 2018

🤦 Sorry about that. I've pushed new profiles without breakpoints.

The regression is much less dire, but the areas for improvement remain the same. In a 21.7s rebuild, 11.7s is spent in processDependenciesBlocksForChunkGroups (including GC), doing work that is unneeded in this particular rebuild, and likely in many or most rebuilds.

@sokra
Member

sokra commented Oct 4, 2018

Thanks.

I think the slowness is related to this piece of code:

webpack/lib/Compilation.js

Lines 1828 to 1845 in 4c461e2

// 1. Get minimal available modules
// It doesn't make sense to traverse a chunk again with more available modules.
// This step calculates the minimal available modules and skips traversal when
// the list didn't shrink.
let minAvailableModules = minAvailableModulesMap.get(chunkGroup);
if (minAvailableModules === undefined) {
	minAvailableModulesMap.set(chunkGroup, new Set(availableModules));
} else {
	let deletedModules = false;
	for (const m of minAvailableModules) {
		if (!availableModules.has(m)) {
			minAvailableModules.delete(m);
			deletedModules = true;
		}
	}
	if (!deletedModules) continue;
	availableModules = minAvailableModules;
}

or this:

const filteredDeps = deps.filter(filterFn);

The algorithm tracks paths through the ChunkGroups of the application and finds the minimal modules available at each ChunkGroup. Depending on the number of possible paths, it can be expensive.

The first snippet is entered for each ChunkGroup with the availableModules from its parent. When no information is available for the entered ChunkGroup, the list is copied. If some information is already available, we calculate the intersection of the available modules. In your profile the first part seems to be responsible for 2.5s and the second part maybe for 7.5s. A low-hanging fruit would be: in step 8, add a needCopy flag which is false when nextChunkGroups.size === 1, and avoid copying the list in step 1.
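
Roughly what I mean, as an untested sketch (the needCopy flag is new; the shape of the queue entries is simplified compared to the real code):

// Step 8 (sketch): a copy is only required when siblings share the set.
for (const nextChunkGroup of nextChunkGroups) {
	queue2.enqueue({
		chunkGroup: nextChunkGroup,
		availableModules: newAvailableModules,
		needCopy: nextChunkGroups.size > 1
	});
}

// Step 1 (sketch): reuse the parent's set directly when no copy is needed
// (needCopy would come from the dequeued queue entry).
if (minAvailableModules === undefined) {
	minAvailableModulesMap.set(
		chunkGroup,
		needCopy ? new Set(availableModules) : availableModules
	);
}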

Or the second snippet is responsible for the 7.5s (the areModulesAvailable call in filterFn). It tries to remove edges in the graph where all modules are already available. A low-hanging fruit would be: newAvailableModules can only shrink for an edge, so once an edge is not filtered out it will never be filtered out. We can store this, e.g. on the edge, and skip the calculation next time.
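
Sketched (untested; storing an alwaysNeeded flag directly on the dependency entry is just illustrative):

// Once filterFn has kept an edge, availableModules only shrinks afterwards,
// so the edge will be kept on every later pass: cache that result and skip
// the areModulesAvailable check next time.
const filteredDeps = [];
for (const dep of deps) {
	if (dep.alwaysNeeded || filterFn(dep)) {
		dep.alwaysNeeded = true;
		filteredDeps.push(dep);
	}
}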

Another easy idea is to merge steps 4 and 5 and filter deps while iterating. There is no real need to create a new array with filter. GC is pretty heavy here, so avoiding object creation could help.
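
I.e. something like this (sketch; processDep is a hypothetical stand-in for whatever the step-5 body does per dependency):

// Merged step 4 + 5 (sketch): no intermediate filteredDeps array is
// allocated, so there is nothing extra for the GC to collect.
for (const dep of deps) {
	if (!filterFn(dep)) continue; // was: deps.filter(filterFn)
	processDep(dep);
}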

Let's start with measuring, to figure out whether these snippets are actually responsible for the slowness. The low-hanging fruits can be tackled anyway (send a PR).

@MLoughry
Contributor Author

MLoughry commented Oct 4, 2018

I continue to find boneheaded mistakes on my part. Apparently, I was profiling against a version of webpack from May, when I last contributed, rather than the most recent. I've re-profiled, and updated the repo.

I've also made changes as I believe you intended, but they're not leading to perf gains as expected. It's entirely probable I misunderstood your suggestions. Either way, the profiles for those are posted to the same repo (appended with -changes).

I've opened #8143 for conversation and comments. From my limited profiling, I'm not seeing any improvements. I'll try running more rigorous benchmarks tomorrow, but I wanted to get something back to you before EU Friday.

@sokra
Member

sokra commented Oct 5, 2018

  • Use the latest Node.js version
  • Use the latest webpack version
  • Try node --max-old-space-size=8192 (maybe you are at the memory limit and GC gets slow there)

If that doesn't help, you could try to make queue2 a priority queue and process the shortest available-modules list first. This should reduce the number of loops needed.
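
For example something like this (untested sketch; it assumes each queue entry carries an availableModules set):

// Minimal binary min-heap keyed by availableModules.size, so the entry
// with the smallest available-modules set is processed first.
class PriorityQueue {
	constructor() {
		this.heap = [];
	}
	get length() {
		return this.heap.length;
	}
	enqueue(item) {
		const heap = this.heap;
		heap.push(item);
		// sift up
		let i = heap.length - 1;
		while (i > 0) {
			const parent = (i - 1) >> 1;
			if (heap[parent].availableModules.size <= heap[i].availableModules.size)
				break;
			[heap[parent], heap[i]] = [heap[i], heap[parent]];
			i = parent;
		}
	}
	dequeue() {
		const heap = this.heap;
		const top = heap[0];
		const last = heap.pop();
		if (heap.length > 0) {
			heap[0] = last;
			// sift down
			let i = 0;
			for (;;) {
				const left = 2 * i + 1;
				const right = left + 1;
				let smallest = i;
				if (
					left < heap.length &&
					heap[left].availableModules.size < heap[smallest].availableModules.size
				)
					smallest = left;
				if (
					right < heap.length &&
					heap[right].availableModules.size < heap[smallest].availableModules.size
				)
					smallest = right;
				if (smallest === i) break;
				[heap[i], heap[smallest]] = [heap[smallest], heap[i]];
				i = smallest;
			}
		}
		return top;
	}
}

Processing the smallest sets first should make the intersections in step 1 shrink the minAvailableModules sets earlier, so fewer re-traversals are needed.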

Maybe also track some statistics for the algorithm, like the number of inputChunkGroups, the number of loops, and how often the loop is exited here:

if (!deletedModules) continue;


Technically we could leave the loop before step 2:

// 2. Get the edges at this point of the graph

when the same chunkGroup is already queued up again. Maybe track a count of how often a chunkGroup is currently queued up, and exit early when the count > 0.
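
A sketch of that (untested; the queuedCounts map is new):

const queuedCounts = new Map();

// whenever a chunkGroup is enqueued:
queuedCounts.set(chunkGroup, (queuedCounts.get(chunkGroup) || 0) + 1);

// in the loop body, after step 1 but before step 2:
const stillQueued = queuedCounts.get(chunkGroup) - 1;
queuedCounts.set(chunkGroup, stillQueued);
if (stillQueued > 0) continue; // queued up again anyway; traverse later in one go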

@MLoughry
Contributor Author

Sorry, I got sidetracked on other projects, but I'm looping back around to this.

Maybe also track some statistics the the algorithm, like the number of inputChunkGroups, the number of loops and how often the loop is exited here:

I have the stats, but I'm not sure how to interpret them. In our build, on an arbitrary recent commit, we have 5 inputChunkGroups and do 9027 total loops in the "Part 2" while loop. 8563 (~95%) of the loops exit at the deletedModules check.

I implemented a priority queue for the first suggestion in your last post @ https://github.com/MLoughry/webpack/tree/priority-queue. It seems to result in a minor perf gain, but it pales in comparison to the monkey patch I mention in #7546.

Running a benchmark of 20 rebuilds:

test run                            avg
webpack@4.21.0                      6.78s
priority-queue branch               6.44s
webpack@4.21.0 with monkey-patch    4.82s

However, either I don't fully understand your second suggestion, or it results in a few snapshot test failures.

test/StatsTestCases.test.js
  ● StatsTestCases › should print correct stats for module-deduplication-named

    expect(value).toMatchSnapshot()

    Received value does not match stored snapshot "StatsTestCases should print correct stats for module-deduplication-named 1".

    - Snapshot
    + Received

    @@ -1,11 +1,11 @@
      "    Asset       Size  Chunks             Chunk Names
    -     e1.js   9.28 KiB       0  [emitted]  e1
    +     e1.js   9.26 KiB       0  [emitted]  e1
          e2.js    9.3 KiB       1  [emitted]  e2
    -     e3.js   9.32 KiB       2  [emitted]  e3
    - async1.js  820 bytes       3  [emitted]  async1
    - async2.js  820 bytes       4  [emitted]  async2
    +     e3.js   9.31 KiB       2  [emitted]  e3
    + async1.js  803 bytes       3  [emitted]  async1
    + async2.js  753 bytes       4  [emitted]  async2
      async3.js  820 bytes       5  [emitted]  async3
      Entrypoint e1 = e1.js
      Entrypoint e2 = e2.js
      Entrypoint e3 = e3.js
      chunk    {0} e1.js (e1) 144 bytes >{3}< [entry] [rendered]
    @@ -15,23 +15,22 @@
       [5] ./e1.js 108 bytes {0} [built]
       [6] ./c.js 9 bytes {0} [built]
      chunk    {1} e2.js (e2) 144 bytes >{4}< [entry] [rendered]
       [0] ./b.js 9 bytes {0} {1} {2} [built]
       [1] ./a.js 9 bytes {0} {1} {2} [built]
    -  [3] ./f.js 9 bytes {1} {4} [built]
    +  [4] ./f.js 9 bytes {1} [built]
       [7] ./e2.js 108 bytes {1} [built]
       [8] ./e.js 9 bytes {1} [built]
      chunk    {2} e3.js (e3) 144 bytes >{5}< [entry] [rendered]
        [0] ./b.js 9 bytes {0} {1} {2} [built]
        [1] ./a.js 9 bytes {0} {1} {2} [built]
    -   [4] ./h.js 9 bytes {2} {5} [built]
    +   [3] ./h.js 9 bytes {2} {5} [built]
        [9] ./e3.js 108 bytes {2} [built]
       [10] ./g.js 9 bytes {2} [built]
    - chunk    {3} async1.js (async1) 89 bytes <{0}> <{5}> >{4}< [rendered]
    + chunk    {3} async1.js (async1) 89 bytes <{0}> <{5}> [rendered]
        [2] ./d.js 9 bytes {0} {3} [built]
       [11] ./async1.js 80 bytes {3} [built]
    - chunk    {4} async2.js (async2) 89 bytes <{1}> <{3}> >{5}< [rendered]
    -   [3] ./f.js 9 bytes {1} {4} [built]
    + chunk    {4} async2.js (async2) 80 bytes <{1}> >{5}< [rendered]
       [12] ./async2.js 80 bytes {4} [built]
      chunk    {5} async3.js (async3) 89 bytes <{2}> <{4}> >{3}< [rendered]
    -   [4] ./h.js 9 bytes {2} {5} [built]
    +   [3] ./h.js 9 bytes {2} {5} [built]
       [13] ./async3.js 80 bytes {5} [built]"

@sokra
Member

sokra commented Oct 18, 2018

hmm... I need to look into that...

@sokra
Member

sokra commented Oct 18, 2018

webpack@4.21.0 with monkey-patch

btw. this is not a valid solution, since it's broken. It's shortcutting part of the calculation because it accidentally modifies the array.

@sokra
Member

sokra commented Oct 18, 2018

hmm... I tried your branch, but it completes all StatsTestCases just fine.

I'll create a PR from your branch to let the CI take a look.

@sokra
Member

sokra commented Oct 18, 2018

#8238

@MLoughry
Contributor Author

The priority-queue branch doesn't include the second suggestion (which caused the failure). I hadn't committed those changes (and they're currently sitting on my home computer while I'm at work).

I agree the monkey-patch is not a valid general-purpose solution, as the change you made that my monkey patch undoes was to fix a legitimate bug. However, if a compilation never had that original issue (as ours doesn't), then the "incorrectness" is irrelevant.

@sokra
Member

sokra commented Oct 19, 2018

Could you try this branch: https://github.com/webpack/webpack/tree/perf/chunk-graph

@ChristianIvicevic

I am having performance issues with this as well, since webpack is going through all unrelated files too, as described in #9020.

@SleepWalker

I've found an interesting thing. If I use babel-plugin-dynamic-import-node-babel-7, which replaces all import()s with require()s, the execution time of processDependenciesBlocksForChunkGroups drops from 3-4s to 500ms (most of the work is performed in the while (queue2.length) { loop).

As a result, I've decided to use that plugin for both node and browser targets in dev mode for faster builds (1.5-2s incremental builds). It may be something in my webpack configuration, but I can't imagine which setting could slow down import processing.
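
For reference, this is roughly how I enable it (sketch; gating it to development is specific to my setup):

// babel.config.js
module.exports = {
	env: {
		development: {
			// resolves to babel-plugin-dynamic-import-node-babel-7
			plugins: ["dynamic-import-node-babel-7"]
		}
	}
};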

@webpack-bot
Contributor

This issue had no activity for at least half a year.

It's subject to automatic issue closing if there is no activity in the next 15 days.

@alexander-akait
Member

Should be fixed by #8242. Anyway, if you still have a problem with perf, please open a new issue with examples, thanks

@sibelius

Is this fixed for you, @SleepWalker?

@SleepWalker

@sibelius, sorry, I can't recheck it. I left that project more than half a year ago and no longer have access to it :(
