Changes to Recursion Depth Checks
#46599
Because our type system is structural, we often end up doing very deep comparisons of members.
Often we're able to "cheat" by seeing if the types are identical, or calculating variance and relating type arguments, but it's not always possible to avoid that.
More than just that, these structural checks can often go infinitely deep.
We used to say "if we see 5 levels of different instantiations of Box, we're not learning anything new" and stop.
"Turtle cutoff"
We've tweaked this over the years.
What does it mean to say "we saw 5 different instantiations of a type?"
Based on how a type was declared, we use some component of that which we call a "recursion identity". Whenever we relate types, we push these recursion identities on the stack. If you see the same identity some number of times in that stack, we stop.
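As a minimal sketch (hypothetical helper, not the checker's actual code), the cutoff can be modeled as counting how many times one recursion identity already appears on the relation stack:

```typescript
// Sketch only: relating types pushes one recursion identity per level, and we
// bail out once the same identity appears maxDepth times on the stack.
function isDeeplyNested(
  identity: string,
  stack: string[],
  maxDepth: number
): boolean {
  let count = 0;
  for (const seen of stack) {
    if (seen === identity) {
      count++;
      if (count >= maxDepth) return true; // seen enough: stop recursing
    }
  }
  return false;
}

// Relating Box<Box<Box<Box<Box<...>>>>> pushes the "Box" identity each level.
const boxStack = ["Box", "Box", "Box", "Box", "Box"];
console.log(isDeeplyNested("Box", boxStack, 5)); // true - cut off
console.log(isDeeplyNested("Box", boxStack, 6)); // false - keep relating
```

The real logic lives in the checker and tracks much more state; this just shows why the choice of identity and the choice of depth limit are the two knobs being discussed below.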
We've had trouble with this in the past; have we fixed this?
Well, this should have been fixed already.
A big part of this had to do with grabbing a recursion identity for the indexed access type (which is Foo for Foo[K1][K2][K3]...[Kn]).
One change in the PR is to swap the max depth to 3 for these checks.
This actually saves a ton of work in libraries whose types grow in a very "bushy" way.
Not a lot of tests here in the performance test suite though. 😔
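For concreteness, here is what a chain of indexed accesses looks like (the Config type is invented for illustration). Every level of the chain shares the root type as its recursion identity, which is why deep chains count against the limit quickly:

```typescript
// Purely illustrative: each indexed access peels one layer off the object
// type, but the recursion identity for Config["a"]["b"]... is the root type
// Config, so a deep chain counts repeatedly against a single identity.
interface Config {
  server: { http: { port: number } };
}
type Port = Config["server"]["http"]["port"]; // resolves to number

const port: Port = 8080;
console.log(port); // 8080
```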
But this exacerbates another issue which is - this cutoff is technically incorrect!
If you manually write out Foo<Foo<Foo<Foo<Foo<Foo<Foo<number>>>>>>> and Foo<Foo<Foo<Foo<Foo<Foo<Foo<string>>>>>>>, then you'll defeat the type-checker because of this depth limit!
Part of the issue is that a generated type is way more suspicious than something a user manually wrote out.
So the fix here: look at the type IDs!
If you have Foo<Foo<Foo<string>>>, each instantiation of Foo that's deeper in the type has a lower type ID.
But if the type IDs always increase while relating, that's a sort of proof that the type system is generating new instantiations while relating.
So only count the type IDs that increase in the recursion stack.
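One way to picture that refinement (an illustrative sketch with invented shapes, not the compiler's code): pair each stack entry with its instantiation's type ID, and count only the occurrences where the ID went up.

```typescript
// Hand-written Foo<Foo<Foo<number>>> is created inside-out, so type IDs
// shrink as we descend; checker-generated chains mint fresh, ever-larger IDs.
// Only ascending IDs count toward the depth limit.
interface StackEntry {
  identity: string;
  typeId: number;
}

function isDeeplyNestedByTypeId(
  stack: StackEntry[],
  identity: string,
  maxDepth: number
): boolean {
  let count = 0;
  let lastTypeId = -1;
  for (const entry of stack) {
    if (entry.identity !== identity) continue;
    if (entry.typeId > lastTypeId) {
      count++; // ID increased: looks like a freshly generated instantiation
      if (count >= maxDepth) return true;
    }
    lastTypeId = entry.typeId;
  }
  return false;
}

// Hand-written nesting: descending IDs, never cut off.
const handWritten = [
  { identity: "Foo", typeId: 107 },
  { identity: "Foo", typeId: 106 },
  { identity: "Foo", typeId: 105 },
];
console.log(isDeeplyNestedByTypeId(handWritten, "Foo", 3)); // false

// Checker-generated nesting: ascending IDs, cut off at depth 3.
const generated = [
  { identity: "Foo", typeId: 200 },
  { identity: "Foo", typeId: 201 },
  { identity: "Foo", typeId: 202 },
];
console.log(isDeeplyNestedByTypeId(generated, "Foo", 3)); // true
```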
Also - we have this logic that tries to leverage already-done checks around unconstrained type parameters.
When you have an instantiation of Foo and Bar with the same unconstrained type parameter, you've done an abstract comparison that applies to every single relation of Foo and Bar with the same type argument.
We do a non-trivial amount of work to look up whether that's been done and leverage that existing work.
The way we used to do that was by running a regex over relationship cache keys.
We calculate these keys a little bit better now.
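A toy sketch of the difference (invented key format; the compiler's real cache keys encode far more): scanning string keys with a regex is linear in the cache size, while computing the canonical key up front makes the lookup a single map hit.

```typescript
// Cache of already-completed abstract comparisons.
const relationCache = new Map<string, boolean>();
relationCache.set("Foo<T>,Bar<T>", true); // T unconstrained

// Old style: regex scan over every key - O(cache size) per lookup.
function lookupByScan(source: string, target: string): boolean | undefined {
  const pattern = new RegExp(`^${source}<.*>,${target}<.*>$`);
  for (const [key, value] of relationCache) {
    if (pattern.test(key)) return value;
  }
  return undefined;
}

// New style: compute the key directly - O(1) per lookup.
function lookupByKey(source: string, target: string): boolean | undefined {
  return relationCache.get(`${source}<T>,${target}<T>`);
}

console.log(lookupByScan("Foo", "Bar")); // true
console.log(lookupByKey("Foo", "Bar"));  // true
```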
Ideas Around Performance in the Checker
We're thinking about performance a lot.
Canonicalize type parameter instantiations
Type parameters are a factory of new type IDs
map and filter and reduce etc. use a whole bunch of type parameters named U.
Each Collection<U> has a totally new instantiation.
From the checker's perspective, it's a waste of work; they're almost always identical.
Though from the language service's perspective, each U being distinct is meaningful for go-to-definition and find-all-refs.
What if we could canonicalize these types?
How much work is that canonicalization?
Easyish for unconstrained type parameters.
More involved for constrained - have to "normalize" signatures first.
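To see where these instantiations come from, here is an ordinary method chain (Array standing in for the Collection<U> mentioned above): each call introduces its own type parameter, so the checker mints a brand-new, structurally identical instantiation per call.

```typescript
// Each chained call has its own generic signature - map's element type is
// conventionally named U - so three calls mean three fresh instantiations
// even when the resulting types are identical in shape.
const xs = [1, 2, 3, 4];
const result = xs
  .map(n => n * 2)      // map<U>(...): U = number, one fresh instantiation
  .filter(n => n > 2)   // filter has its own generic overload, too
  .map(n => `#${n}`);   // another fresh U, this time string
console.log(result); // ["#4", "#6", "#8"]
```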
Determining references that can actually be narrowed
Enormous control flow graphs that need to be walked through for a variable just to find that it doesn't change anything.
70k calls to get the control-flow analyzed type, only 6800 actually change the type.
Binder could keep track of identifiers that occur in narrowing expressions in some qualifying scope (function body? source file?), and we could consult those before walking up the control flow graph.
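An illustrative pair of cases (invented functions): every read of total below asks for its control-flow-analyzed type, yet nothing in the function ever narrows it, so the graph walk always returns the declared type. Knowing up front which identifiers appear in narrowing expressions would let the checker skip that walk.

```typescript
// No narrowing anywhere: the flow analysis for `total` is pure overhead.
function sum(values: number[]): number {
  let total = 0;
  for (const v of values) {
    total += v; // flow-analyzed read, but the type is always `number`
  }
  return total;
}

// Contrast: `x` genuinely appears in a narrowing expression.
function describe(x: string | null): string {
  if (x === null) return "nothing";
  return x.toUpperCase(); // narrowing to `string` made this safe
}

console.log(sum([1, 2, 3]));   // 6
console.log(describe(null));   // "nothing"
console.log(describe("hi"));   // "HI"
```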
Calling getNodeId less (for NodeLinks)
Stop calling getNodeId in compiler outside of the checker #46595
Currently we need getNodeId to get a valid Node ID.
Going from a Map to a WeakMap - in theory faster than a Map, since the WeakMap implementation could hold the value directly on the Node.
Possible impact on the Node type?
Concerns around WeakMaps and their unpredictable performance.
To test Map vs. WeakMap, you have to do a node ID-ectomy - remove all occurrences of getNodeId.
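A hedged sketch of the trade-off being weighed (SyntaxNode, NodeLinks, and the fields are invented stand-ins): the Map route needs a numeric ID, which getNodeId mints by stashing a counter value on the node, while the WeakMap route keys on the node object itself, so no ID is ever needed and entries can be collected along with the node.

```typescript
interface SyntaxNode { kind: string; id?: number }
interface NodeLinks { resolvedType?: string }

let nextNodeId = 0;
function getNodeId(node: SyntaxNode): number {
  // Mutating the node to cache the ID is the part that touches its shape.
  return node.id ?? (node.id = ++nextNodeId);
}

// Keyed by node ID - every access pays for a getNodeId call:
const linksById = new Map<number, NodeLinks>();
// Keyed by the node itself - no IDs anywhere:
const linksByNode = new WeakMap<SyntaxNode, NodeLinks>();

const n: SyntaxNode = { kind: "Identifier" };
linksById.set(getNodeId(n), { resolvedType: "string" });
linksByNode.set(n, { resolvedType: "string" });
console.log(linksById.get(getNodeId(n))?.resolvedType); // "string"
console.log(linksByNode.get(n)?.resolvedType);          // "string"
```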