Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implicit yield performance fix #17096

Merged
merged 12 commits into from May 3, 2024

Conversation

krauthaufen
Copy link
Contributor

@krauthaufen krauthaufen commented Apr 26, 2024

Description

Hey, we're heavily using implicit yields in CE builders for our UI library and have incredibly slow autocomplete & compile times for relatively simple files.

After some investigation we found that the runtime scales exponentially with the nesting depth of such CE builders, like:

div {
    div {
        ul {
            li {
                // at about 6-10 levels performance will suffer drastically
                "Item 1"
            }
        }
        "Hello"
    }
    "World"
}

A quick repro can be found here: https://gist.github.com/krauthaufen/a23a622beac99a89f5ce9d992266ab91

There you see compile-times around 10s for a 50-line file and whenever you change anything all completions & type-hovers will take a comparable amount of time.

So I did some digging in the compiler-source and found the exponential blow-up which i circumvented by simply caching already type-checked expressions.

This solves our issue and compile-time for the file is down to approx. 0.5s.
In order to test this properly i checked your benchmarks and found no difference in runtimes, which showed that all the CE-benchmarks do not clear the FsChecker caches and therefore all tests basically just returned cached results. This of course is not a realistic benchmark for the type-checker.

Therefore I adjusted the benchmarks in this PR to run with and without cleared caches. The results show the expected exponential behaviour with the current head and a drastically improved behaviour with my caching fix.

Here are the most important benchmarks compared with cleared FsChecker-Caches (full report below)

With TC-Caching (this PR)

Method Source Mean Error StdDev Median Gen0 Gen1 Gen2 Allocated
CheckCE CE100xnest1.fs 324.733 ms 2.6600 ms 2.2212 ms 325.190 ms 1000.0000 1000.0000 1000.0000 298976.55 KB
CheckCE CE100xnest5.fs 1,678.710 ms 11.0937 ms 10.3771 ms 1,678.842 ms 4000.0000 2000.0000 1000.0000 1168517.37 KB
CheckCE CE1xnest15.fs 73.001 ms 1.3875 ms 2.7063 ms 72.631 ms - - - 113193.98 KB
CheckCE CE200xnest5.fs 10,435.441 ms 72.6876 ms 67.9920 ms 10,444.004 ms 49000.0000 16000.0000 4000.0000 3692512.19 KB

Without TC-Caching (current main branch)

Method Source Mean Error StdDev Gen0 Gen1 Gen2 Allocated
CheckCE CE100xnest1.fs 384.0 ms 7.46 ms 6.61 ms 1000.0000 1000.0000 1000.0000 311.17 MB
CheckCE CE100xnest5.fs 3,029.7 ms 33.97 ms 30.11 ms 25000.0000 8000.0000 2000.0000 2675.71 MB
CheckCE CE1xnest15.fs 11,202.7 ms 163.92 ms 153.33 ms 186000.0000 50000.0000 3000.0000 14481.26 MB
CheckCE CE200xnest5.fs 14,061.1 ms 211.46 ms 197.80 ms 97000.0000 30000.0000 3000.0000 6676.24 MB

I really hope you can merge this, since our UI library heavily relies on implicit yields and we currently code without autocomplete and have 10m compile times for average projects.

Sadly the benchmarks with the current main branch took ages with the cleared FsChecker caches, so i currently only have partial results as seen below, but I think the above tables should be enough to see that11s vs 73ms for a very simple file is drastic.

Let me know if you have any further questions/requirements I need to fix for this to be accepted.

Cheers

Full Benchmark Results for this PR (with cleared FsChecker-Caches)
// * Summary *

BenchmarkDotNet v0.13.10, macOS Sonoma 14.4.1 (23E224) [Darwin 23.4.0]
Apple M1 Max, 1 CPU, 10 logical and 10 physical cores
.NET SDK 9.0.100-preview.2.24157.14
  [Host]     : .NET 8.0.4 (8.0.424.16909), Arm64 RyuJIT AdvSIMD DEBUG
  Job-NHLZOK : .NET 8.0.4 (8.0.424.16909), Arm64 RyuJIT AdvSIMD

InvocationCount=1  UnrollFactor=1  
Method Source EmptyCache Mean Error StdDev Median Gen0 Gen1 Gen2 Allocated
CheckCE CE100xnest1.fs True 324.733 ms 2.6600 ms 2.2212 ms 325.190 ms 1000.0000 1000.0000 1000.0000 298976.55 KB
CompileCE CE100xnest1.fs True 675.787 ms 12.7977 ms 12.5691 ms 676.414 ms 1000.0000 1000.0000 1000.0000 587006.49 KB
CheckCE CE100xnest5.fs True 1,678.710 ms 11.0937 ms 10.3771 ms 1,678.842 ms 4000.0000 2000.0000 1000.0000 1168517.37 KB
CompileCE CE100xnest5.fs True 6,263.758 ms 28.4133 ms 26.5778 ms 6,257.495 ms 49000.0000 17000.0000 3000.0000 4356772.27 KB
CheckCE CE1xnest15.fs True 73.001 ms 1.3875 ms 2.7063 ms 72.631 ms - - - 113193.98 KB
CompileCE CE1xnest15.fs True 111.028 ms 3.0224 ms 8.1712 ms 113.578 ms - - - 158897.6 KB
CheckCE CE200xnest5.fs True 10,435.441 ms 72.6876 ms 67.9920 ms 10,444.004 ms 49000.0000 16000.0000 4000.0000 3692512.19 KB
CompileCE CE200xnest5.fs True 235,692.149 ms 4,266.2506 ms 3,990.6535 ms 237,308.832 ms 1886000.0000 536000.0000 77000.0000 115650907.01 KB
CheckCE CEwCO100xnest5.fs True 193.913 ms 2.7059 ms 2.5311 ms 193.107 ms 1000.0000 1000.0000 1000.0000 260968.91 KB
CompileCE CEwCO100xnest5.fs True 311.200 ms 2.1380 ms 1.7853 ms 310.587 ms 2000.0000 1000.0000 1000.0000 431118 KB
CheckCE CEwCO500xnest1.fs True 126.002 ms 1.4002 ms 1.1693 ms 126.155 ms - - - 146926.84 KB
CompileCE CEwCO500xnest1.fs True 3,849.941 ms 52.3828 ms 48.9989 ms 3,836.408 ms 4000.0000 2000.0000 1000.0000 1701728.95 KB
CheckCE CE100xnest1.fs False 1.851 ms 0.0550 ms 0.1612 ms 1.853 ms - - - 386.04 KB
CompileCE CE100xnest1.fs False 692.532 ms 8.7299 ms 8.1660 ms 690.097 ms 1000.0000 1000.0000 1000.0000 582473.88 KB
CheckCE CE100xnest5.fs False 1.689 ms 0.0582 ms 0.1715 ms 1.647 ms - - - 451.41 KB
CompileCE CE100xnest5.fs False 6,270.485 ms 49.7496 ms 46.5358 ms 6,283.342 ms 55000.0000 17000.0000 3000.0000 4344416.89 KB
CheckCE CE1xnest15.fs False 1.971 ms 0.0606 ms 0.1786 ms 1.950 ms - - - 392.43 KB
CompileCE CE1xnest15.fs False 104.611 ms 3.6779 ms 10.3134 ms 109.419 ms 1000.0000 1000.0000 1000.0000 146415.43 KB
CheckCE CE200xnest5.fs False 1.683 ms 0.0599 ms 0.1756 ms 1.659 ms - - - 524.33 KB
CompileCE CE200xnest5.fs False 238,728.773 ms 2,218.0732 ms 2,074.7871 ms 238,917.222 ms 1893000.0000 549000.0000 77000.0000 115644053.88 KB
CheckCE CEwCO100xnest5.fs False 2.044 ms 0.0713 ms 0.2080 ms 2.024 ms - - - 687.12 KB
CompileCE CEwCO100xnest5.fs False 314.878 ms 2.9447 ms 2.7545 ms 315.616 ms 2000.0000 1000.0000 1000.0000 418887.84 KB
CheckCE CEwCO500xnest1.fs False 2.138 ms 0.0571 ms 0.1674 ms 2.118 ms - - - 472.3 KB
CompileCE CEwCO500xnest1.fs False 3,850.093 ms 51.5484 ms 48.2184 ms 3,843.526 ms 6000.0000 3000.0000 2000.0000 1697368.86 KB
Partial Benchmark Results for main branch (with cleared FsChecker-Caches)
// * Summary *

BenchmarkDotNet v0.13.10, macOS Sonoma 14.4.1 (23E224) [Darwin 23.4.0]
Apple M1 Max, 1 CPU, 10 logical and 10 physical cores
.NET SDK 9.0.100-preview.2.24157.14
  [Host]     : .NET 8.0.4 (8.0.424.16909), Arm64 RyuJIT AdvSIMD DEBUG
  Job-VTEZKW : .NET 8.0.4 (8.0.424.16909), Arm64 RyuJIT AdvSIMD

InvocationCount=1  UnrollFactor=1  
Method Source EmptyCache Mean Error StdDev Gen0 Gen1 Gen2 Allocated
CheckCE CE100xnest1.fs True 384.0 ms 7.46 ms 6.61 ms 1000.0000 1000.0000 1000.0000 311.17 MB
CheckCE CE100xnest5.fs True 3,029.7 ms 33.97 ms 30.11 ms 25000.0000 8000.0000 2000.0000 2675.71 MB
CheckCE CE1xnest15.fs True 11,202.7 ms 163.92 ms 153.33 ms 186000.0000 50000.0000 3000.0000 14481.26 MB
CheckCE CE200xnest5.fs True 14,061.1 ms 211.46 ms 197.80 ms 97000.0000 30000.0000 3000.0000 6676.24 MB
CheckCE CEwCO100xnest5.fs True 204.7 ms 2.71 ms 2.41 ms 2000.0000 1000.0000 1000.0000 254.86 MB
CheckCE CEwCO500xnest1.fs True 132.4 ms 2.63 ms 3.69 ms - - - 143.46 MB

@krauthaufen krauthaufen requested a review from a team as a code owner April 26, 2024 19:32
Copy link
Contributor

github-actions bot commented Apr 26, 2024

❗ Release notes required


✅ Found changes and release notes in following paths:

Change path Release notes path Description
src/Compiler docs/release-notes/.FSharp.Compiler.Service/8.0.400.md

@vzarytovskii
Copy link
Member

Yeah, CE checking is pretty hard and complex currently. This looks like a low hanging fruit. If it doesn't break anything else, it's an easy merge. We need to go through thorough review though.

…ed expressions

* added TType return to TryTcStmt
krauthaufen and others added 2 commits April 29, 2024 13:17
…ng duplicated ranges of implicitly yielded expressions
Co-authored-by: Vlad Zarytovskii <vzaritovsky@hotmail.com>
@krauthaufen
Copy link
Contributor Author

Some error messages in neg24.fs seem to have changed here. It looks like this could be caused by my PR, but I'm not entirely sure if this is relevant, since both error messages basically say the same thing as far as I can tell.

@vzarytovskii
Copy link
Member

vzarytovskii commented Apr 29, 2024

Some error messages in neg24.fs seem to have changed here. It looks like this could be caused by my PR, but I'm not entirely sure if this is relevant, since both error messages basically say the same thing as far as I can tell.

It is related to the changes, seems to be related to type inference (obj vs int), maybe ordering. I've hit the same (or similar) issue when was implementing caching for custom operations type checking in CEs.

@krauthaufen
Copy link
Contributor Author

Good to know, should I try to fix it or just leave it that way?
Is there anything else you need from me?

@T-Gro
Copy link
Member

T-Gro commented Apr 30, 2024

Good to know, should I try to fix it or just leave it that way? Is there anything else you need from me?

It is a change of behavior induced by the change - it does need fixing IMO.
It might be because caching of the typechecked expression is not always the right thing to do, might be that in case of generalized variables (TType_var) a new inference variable should be created instead of reusing the cached down. But that is really a very wild guess, this does need deeper debugging I am afraid.

@krauthaufen
Copy link
Contributor Author

So the only difference I found is that

neg24.fs(331,24,331,25): typecheck error FS0001: The 'if' expression needs to have type 'obj' to satisfy context type requirements. It currently has type 'int'.

now reads:

typecheck error FS0001: This expression was expected to have type
    'obj'    
but here has type
    'int'    

The synExpr1 now freely "invents" its own skolem type which later gets unified with the other one. But in essence these two things say the same thing in different words. And it can only affect implicitly-yielded values right?

@T-Gro
Copy link
Member

T-Gro commented May 2, 2024

But in essence these two things say the same thing in different words.

I agree, it is not a regression. You can change the baseline in the .bsl file to reflect that.

And it can only affect implicitly-yielded values right?

True, that is the only place of impact.

@krauthaufen
Copy link
Contributor Author

Hey, I realized that my type-unification didn't have the proper range (i used range0 for testing and forgot about it). I now fixed the range and the regression seems to have disappeared. The only sensible conclusion for me is that the error gets formatted differently when inside the range of some CE.

Nonetheless the regression is gone and the rest of the failing checks seem to be unrelated to my change.

Cheers

@vzarytovskii
Copy link
Member

Nonetheless the regression is gone and the rest of the failing checks seem to be unrelated to my change.

One failing check is formatting, you can either format locally on your machine (it should be described in DEVGUIDE.md), or comment /run fantomas here.

Regarding other failures, it's currently some fluke it seems.

@dotnet dotnet deleted a comment from github-actions bot May 2, 2024
@krauthaufen
Copy link
Contributor Author

/run fantomas

@krauthaufen
Copy link
Contributor Author

Okay I'll try to do that locally 😉

  Co-authored-by: krauthaufen <6370801+krauthaufen@users.noreply.github.com>
Copy link
Contributor

github-actions bot commented May 2, 2024

@psfinaki
Copy link
Member

psfinaki commented May 2, 2024

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@krauthaufen
Copy link
Contributor Author

CI classic 😉

@T-Gro T-Gro enabled auto-merge (squash) May 3, 2024 08:42
@T-Gro T-Gro merged commit bbba58a into dotnet:main May 3, 2024
32 checks passed
@krauthaufen
Copy link
Contributor Author

Nice 🥳🥳🥳

How do you release these things?
I saw that there are approximately monthly releases of dotnet sdks (even old versions get re-released) but when will i realistically see this change reflected in Visual Studio/Ionide/Rider?

We currently work around the problem by explicitly yielding everything but a rough estimate of how long we need to keep this up would be nice.

Thanks to everyone involved!

@baronfel
Copy link
Member

baronfel commented May 3, 2024

The monthly SDK releases are typically only security fixes or big regressions. This kind of fix I'd expect to ship in one of the quarterly SDK releases, so I'd expect something like 8.0.400 in the August time frame.

Ionide builds on the libraries that this repo produces, which are released on the same cadence as the SDK - monthly security/bugfixes, quarterly feature drops.

@krauthaufen
Copy link
Contributor Author

Okay, thanks. Just out of curiosity: does visual studio use the fscs from the relevant sdk or does it have its own copy of things?

@T-Gro
Copy link
Member

T-Gro commented May 3, 2024

VS has its own copy in-proc for intellisense, diagnostics etc. So change would come with next VS Preview.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

6 participants