Cache flow branch label results the same way as flow loop results #41387
Conversation
@typescript-bot perf test this
Heya @weswigham, I've started to run the perf test suite on this PR at 6874d65. You can monitor the build here. Update: The results are in!
@weswigham Here they are: Comparison Report - master..41387
Excellent - perf seems fine. @ahejlsberg any opinions? You've been pretty vocal about how we approach caching in control flow in the past, but I figure this is probably pretty safe, since it's an extension of the caching we already did at loop labels.
I'm concerned that we'll be caching a lot more data than we currently do, as branch labels are far more common than loop labels. It does look like we're consuming more memory in the Angular, Monaco, and TFS suites, and the check times are slightly higher for those as well. It'd be great to find a way to only cache when it matters, particularly as it very rarely does.
Less than half a percent, and not uniformly across node versions. Memory usage in …

It matters in every case more complex than … (which is to say, a control flow graph with only a single statement with a single branch label), and those trivial examples likely make up a very small fraction of code...
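As a hypothetical illustration of such a trivial case (an assumed example, not the snippet from the comment above):

```ts
// A hypothetical minimal case: a single statement with a single `&&`,
// i.e. exactly one branch label in the control flow graph. Caching its
// result buys nothing, since that label is only ever visited once.
declare const x: string | undefined;
x && x.length;
```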
I think I found the root cause of the issue. After digging into the flow graph a bit, it became apparent that the … If we keep track of the shared node and register its result as computed for the antecedent …, I no longer see this blowing up:

```diff
--- src/compiler/checker.ts	(revision 6874d6514dea4679c39185790764bbf990777567)
+++ src/compiler/checker.ts	(date 1604522021008)
@@ -21685,6 +21685,13 @@
         return key = getFlowCacheKey(reference, declaredType, initialType, flowContainer);
     }
 
+    function addSharedFlowType(flow: FlowNode, type: FlowType) {
+        // Record visited node and the associated type in the cache.
+        sharedFlowNodes[sharedFlowCount] = flow;
+        sharedFlowTypes[sharedFlowCount] = type;
+        sharedFlowCount++;
+    }
+
     function getTypeAtFlowNode(flow: FlowNode): FlowType {
         if (flowDepth === 2000) {
             // We have made 2000 recursive invocations. To avoid overflowing the call stack we report an error
@@ -21695,6 +21702,7 @@
             return errorType;
         }
         flowDepth++;
+        let sharedFlow: FlowNode | undefined;
         while (true) {
             const flags = flow.flags;
             if (flags & FlowFlags.Shared) {
@@ -21719,6 +21730,9 @@
             else if (flags & FlowFlags.Call) {
                 type = getTypeAtFlowCall(<FlowCall>flow);
                 if (!type) {
+                    if (!sharedFlow && flags & FlowFlags.Shared) {
+                        sharedFlow = flow;
+                    }
                     flow = (<FlowCall>flow).antecedent;
                     continue;
                 }
@@ -21775,11 +21789,11 @@
                 // simply return the non-auto declared type to reduce follow-on errors.
                 type = convertAutoToAny(declaredType);
             }
+            if (sharedFlow) {
+                addSharedFlowType(sharedFlow, type);
+            }
             if (flags & FlowFlags.Shared) {
-                // Record visited node and the associated type in the cache.
-                sharedFlowNodes[sharedFlowCount] = flow;
-                sharedFlowTypes[sharedFlowCount] = type;
-                sharedFlowCount++;
+                addSharedFlowType(flow, type);
             }
             flowDepth--;
             return type;
```

@weswigham @ahejlsberg If this does indeed look like a correct approach, I'd be interested in opening a PR with this change 😃
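To make the idea in the diff concrete, here is a small standalone sketch of the same memoization pattern. All names here (`SketchNode`, `evaluate`, `resultCache`) are invented for illustration; this is not the checker's actual API. While walking back through antecedents, we remember the first "shared" node skipped because it produced no result, and record the eventual result for it too, so later walks short-circuit there.

```ts
// Invented types and names; a hedged sketch of the caching pattern only.
interface SketchNode {
    shared: boolean;          // analogous to a node with FlowFlags.Shared
    antecedent?: SketchNode;  // node we fall back to when no result here
    compute?: () => number | undefined;
}

const resultCache = new Map<SketchNode, number>();

function evaluate(start: SketchNode): number {
    let skippedShared: SketchNode | undefined;
    let node = start;
    const finish = (value: number): number => {
        // Cache for the first skipped shared node as well as the final node.
        if (skippedShared) resultCache.set(skippedShared, value);
        if (node.shared) resultCache.set(node, value);
        return value;
    };
    while (true) {
        const cached = resultCache.get(node);
        if (cached !== undefined) return finish(cached);
        const result = node.compute?.();
        if (result === undefined && node.antecedent) {
            // This node contributed nothing; remember the first shared node
            // we pass through so the eventual result is cached for it too.
            if (!skippedShared && node.shared) skippedShared = node;
            node = node.antecedent;
            continue;
        }
        return finish(result ?? 0);
    }
}

// Example: a -> b (shared, no result) -> c (computes 42).
const c: SketchNode = { shared: false, compute: () => 42 };
const b: SketchNode = { shared: true, antecedent: c };
const a: SketchNode = { shared: false, antecedent: b };
evaluate(a); // computes 42 at c and caches it for b
evaluate(a); // stops at b via the cache, never revisiting c
```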
@JoostK that looks like another valid approach to the issue - a PR seems welcome to me.
See #41665 for what I think is the simplest possible fix. |
Fixes #41124
Fundamentally, the underlying issue actually wasn't tied to control flow call nodes, but rather to the repeated branch labels produced by the `&&` expressions. As we checked the file, on the first call expression line we'd check a control flow graph like *(flow graph diagram)*, and then on the second: *(flow graph diagram)*, and so on. The graph of the last call expression line grows linearly with the number of lines; however, because no results were cached across invocations (only branch labels were actually considered, since all the `Call` nodes are noops), the actual number of control flow nodes checked across all invocations grew at a much faster rate than input size. By caching the results at branch label locations (as we already do for loop labels, since we always knew we'd visit them multiple times), we can avoid this growth in most circumstances. Technically, this caching is extraneous in certain edge cases where the full control flow graph is trivially small (essentially one label and a `Start`, so we won't revisit the branch on a future invocation because there won't be a future invocation), but the structure of the control flow graph precludes detecting those scenarios (we can't know whether the graph we're looking at is a subset of a larger graph or not).

I say we can avoid super-linear growth in *most* circumstances because there is an exception: `ReduceLabel` control flow nodes, used for `finally` block handling, require us to bust the cache, as they cause us to mutate other nodes, which has undesirable follow-on analysis effects if cached. Discovering this leads me to believe we may have an existing bug with `for` loops and `finally` blocks not giving correct results, but as I couldn't come up with a test case, I left the for-loop label result caching logic as-is.
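To picture the pattern being described, here is a hedged sketch of the kind of input that triggers the blow-up; the names are invented, and the real reproduction lives in #41124:

```ts
// Hypothetical reduction (assumed shape, not the actual repro): each
// statement's `&&` adds a branch label, and flow analysis for a reference
// on line N walks back through the labels of every earlier line. Without
// caching at branch labels, the total number of flow nodes visited across
// the file grows super-linearly in the number of lines.
declare function check(value: unknown): value is string;
declare const input: unknown;
let out = "";
out = check(input) && input.length > 0 ? input : out;
out = check(input) && input.length > 1 ? input : out;
out = check(input) && input.length > 2 ? input : out;
// ...repeated for hundreds of lines
```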