checkexpr: cache type of container literals when possible #12707

huguesb · 2022-05-01T07:25:59Z

When a container (list, set, tuple, or dict) literal expression is
used as an argument to an overloaded function, it gets type checked
repeatedly. This becomes particularly problematic when the expression
is somewhat large, as seen in #9427.
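
For illustration, here is a minimal sketch of the kind of call this describes. The function `process` and its overloads are hypothetical and are not taken from #9427. The point is only that the dict literal is an argument to an overloaded function, so a checker may need to infer its type again for each overload it tries.

```python
from typing import Dict, List, Union, overload


@overload
def process(data: List[int]) -> int: ...
@overload
def process(data: Dict[str, int]) -> str: ...
def process(data: Union[List[int], Dict[str, int]]) -> Union[int, str]:
    # Hypothetical implementation; only the overloaded signature matters here.
    if isinstance(data, list):
        return sum(data)
    return ",".join(sorted(data))


# The literal below is the argument whose type may be inferred repeatedly
# while the checker looks for a matching overload. Real cases are much larger.
result = process({"a": 1, "b": 2, "c": 3})
```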

To avoid repeated work, add a new field in the relevant AST nodes to
cache the resolved type of the expression. Right now the cache is
only used in the fast path, although it could conceivably be leveraged
for the slow path as well in a follow-up commit.

To further reduce duplicate work, when the fast path fails, we record
that failure in the cache so that the fast path is not attempted again
for the same expression.
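
The idea above can be sketched roughly as follows. This is a toy model and not mypy's code: `ListLiteral`, `cached_type`, `FAST_PATH_FAILED`, and `fast_list_type` are hypothetical names, and the "fast path" here simply succeeds when all items share one Python type.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical marker meaning "the fast path was tried and did not apply".
FAST_PATH_FAILED = "<fast-path-failed>"

# Counts how often the (pretend) expensive inference actually runs.
stats: Dict[str, int] = {"attempts": 0}


@dataclass
class ListLiteral:
    """Toy stand-in for an AST node; not mypy's real node class."""
    items: List[object]
    cached_type: Optional[str] = None  # the extra cache field on the node


def fast_list_type(node: ListLiteral) -> Optional[str]:
    """Return a type string such as 'list[int]', or None if the fast path fails."""
    if node.cached_type is not None:
        # Cache hit: either a resolved type or the recorded failure.
        if node.cached_type == FAST_PATH_FAILED:
            return None
        return node.cached_type
    stats["attempts"] += 1
    item_types = {type(item).__name__ for item in node.items}
    if len(item_types) == 1:
        node.cached_type = "list[" + item_types.pop() + "]"
        return node.cached_type
    # Remember the failure so later calls skip the attempt entirely.
    node.cached_type = FAST_PATH_FAILED
    return None
```

Calling `fast_list_type` twice on the same node does the inference work once, whether the first call succeeded or failed.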

For #9427

huguesb · 2022-05-03T20:32:51Z

@JukkaL @hauntsaninja I would appreciate a review. I realize that this approach to caching might cause some debate, but I do think it is worthwhile.

JukkaL · 2022-05-09T15:54:49Z

[Not a full review.] I don't love the idea of putting the cached inferred type in the AST node. It can get stale if we reuse the AST in daemon, etc. It seems like it could cause some hard-to-debug bugs.

However, caching intermediate results is a good idea -- we just need an implementation that isn't error-prone. One idea would be to maintain a cache in TypeChecker as a dict from AST node to inferred type. We can then flush the cache at the end of each function, module or block, for example, to avoid cached results impacting results in another type checking pass, etc.
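
A minimal sketch of that alternative, assuming a hypothetical `Checker` class that owns the cache and a hypothetical `finish_function` hook that flushes it. None of these names come from mypy, and the inference itself is a placeholder.

```python
from typing import Dict, List


class Node:
    """Toy AST node; hashed by identity like any ordinary Python object."""

    def __init__(self, items: List[object]) -> None:
        self.items = items


class Checker:
    """Hypothetical checker that keeps inferred types out of the AST."""

    def __init__(self) -> None:
        self.type_cache: Dict[Node, str] = {}
        self.infer_calls = 0  # how often the real inference ran

    def infer(self, node: Node) -> str:
        cached = self.type_cache.get(node)
        if cached is not None:
            return cached
        self.infer_calls += 1
        names = sorted({type(item).__name__ for item in node.items})
        result = "list[" + " | ".join(names) + "]"
        self.type_cache[node] = result
        return result

    def finish_function(self) -> None:
        # Flush so cached results cannot leak into a later checking pass
        # over the same (possibly reused) AST.
        self.type_cache.clear()
```

Because the node itself is never mutated, a reused AST carries no stale state; the cost is one dictionary lookup per cached expression.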

huguesb · 2022-05-09T19:17:25Z

> [Not a full review.] I don't love the idea of putting the cached inferred type in the AST node. It can get stale if we reuse the AST in daemon, etc. It seems like it could cause some hard-to-debug bugs.

I'm not attached to this approach, although I'm curious: when would the AST get reused? Presumably the daemon has to parse files again if they have changed? Or might an AST be kept around and re-checked after a dependency is updated?

> However, caching intermediate results is a good idea -- we just need an implementation that isn't error-prone. One idea would be to maintain a cache in TypeChecker as a dict from AST node to inferred type. We can then flush the cache at the end of each function, module or block, for example, to avoid cached results impacting results in another type checking pass, etc.

Sure, I can rework this to use a dict cache instead.

When a container (list, set, tuple, or dict) literal expression is used as an argument to an overloaded function it will get repeatedly typechecked. This becomes particularly problematic when the expression is somewhat large, as seen in python#9427.

To avoid repeated work, add a new cache in ExprChecker, mapping the AST node to the resolved type of the expression. Right now the cache is only used in the fast path, although it could conceivably be leveraged for the slow path as well in a follow-up commit.

To further reduce duplicate work, when the fast-path doesn't work, we use the cache to make a note of that, to avoid repeatedly attempting to take the fast path.

Fixes python#9427

github-actions · 2022-05-16T08:19:57Z

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

JukkaL

Looks good now, thanks for the updates! Large container literals are common, and it's nice to be able to type check more of them quickly.

huguesb force-pushed the pr-fast-container-type-cache branch from 8b2bf54 to 3acbc1f Compare May 1, 2022 07:26

huguesb mentioned this pull request May 1, 2022

Chained calls with medium/large dict/list literals are slow to typecheck #9427

Closed

huguesb force-pushed the pr-fast-container-type-cache branch from 3acbc1f to ac11374 Compare May 3, 2022 20:31

huguesb force-pushed the pr-fast-container-type-cache branch from ac11374 to af675fe Compare May 16, 2022 07:47

huguesb mentioned this pull request May 18, 2022

Release 0.960 planning #12807

Closed

JukkaL approved these changes May 20, 2022

JukkaL merged commit 1b7e33f into python:master May 20, 2022

pranavrajpal mentioned this pull request May 31, 2022

Refactor mypy to use query-based architecture #12911

Open

huguesb deleted the pr-fast-container-type-cache branch June 10, 2022 08:37

huguesb mentioned this pull request Dec 10, 2022

Wildly inconsistent performance for seemingly trivial changes to source code #14271

Open
