subtypes: fast path for Union/Union subtype check

Enums are exploded into a Union of Literals when narrowed.

Conditional branches on enum values can result in multiple distinct narrowings of the same enum, which are later subject to subtype checks (most notably via `is_same_type`, when exiting frame context in the binder). Such checks have quadratic complexity, `O(N*M)`, where `N` and `M` are the number of entries in each narrowed enum variable, and lead to a drastic slowdown if any of the enums involved has a large number of values.

Implement a linear-time fast path where literals are quickly filtered, with a fallback to the slow path for more complex values.

In our codebase there is one method with a chain of a dozen if statements operating on instances of an enum with hundreds of values. Prior to the regression it was typechecked in less than 1s. After the regression it takes over 13min to typecheck. This patch fully fixes the regression for us.

Fixes #13821
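The fast path described above can be sketched outside of mypy as follows. This is a simplified illustration, not the code in this patch: plain strings stand in for literal types, and `union_is_subtype`, `CountingCheck`, and the `Color.V*` names are hypothetical. It assumes every item is hashable, so that membership in the right-hand side can be tested with a set.

```python
from typing import Callable, Iterable


def union_is_subtype(
    left_items: Iterable[str],
    right_items: Iterable[str],
    slow_check: Callable[[str, str], bool],
) -> bool:
    """Return True if every left item is accepted by some right item."""
    right_list = list(right_items)
    right_set = set(right_list)  # built once, O(M)
    for item in left_items:
        # Fast path: an identical entry on the right settles this item
        # with a single hash lookup.
        if item in right_set:
            continue
        # Slow path: compare against every right item, as the
        # unoptimized check would do for all items.
        if not any(slow_check(item, r) for r in right_list):
            return False
    return True


class CountingCheck:
    """Hypothetical pairwise check that records how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, left: str, right: str) -> bool:
        self.calls += 1
        return left == right


# Two slightly different narrowings of a 300-member enum (stand-in names).
members = [f"Color.V{i}" for i in range(300)]
narrowed_a = members[:-1]  # everything except the last member
narrowed_b = members[1:]   # everything except the first member
```

When the left-hand items all appear on the right, the pairwise check is never invoked; only items without an exact match pay for the full scan, which is what keeps the common enum case linear.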
python · Dec 21, 2022 · cc5bab6
1 parent 2514610
Showing 1 changed file with 37 additions and 5 deletions.
diff --git a/mypy/subtypes.py b/mypy/subtypes.py
@@ -57,6 +57,7 @@
     UninhabitedType,
     UnionType,
     UnpackType,
+    flatten_nested_unions,
     get_proper_type,
     is_named_instance,
 )
@@ -877,19 +878,50 @@ def visit_overloaded(self, left: Overloaded) -> bool:
             return False
 
     def visit_union_type(self, left: UnionType) -> bool:
-        if isinstance(self.right, Instance):
+        if isinstance(self.right, (UnionType, Instance)):
+            # prune literals early to avoid nasty quadratic behavior which would otherwise arise when checking
+            # subtype relationships between slightly different narrowings of an Enum
+            # we achieve O(N+M) instead of O(N*M)
+
+            right_lit_types: set[Instance] = set()
+            right_lit_values: set[LiteralType] = set()
+
+            if isinstance(self.right, UnionType):
+                for item in flatten_nested_unions(
+                    self.right.relevant_items(), handle_type_alias_type=True
+                ):
+                    p_item = get_proper_type(item)
+                    if isinstance(p_item, LiteralType):
+                        right_lit_values.add(p_item)
+                    elif isinstance(p_item, Instance):
+                        if p_item.last_known_value is None:
+                            right_lit_types.add(p_item)
+                        else:
+                            right_lit_values.add(p_item.last_known_value)
+            elif isinstance(self.right, Instance):
+                if self.right.last_known_value is None:
+                    right_lit_types.add(self.right)
+                else:
+                    right_lit_values.add(self.right.last_known_value)
+
             literal_types: set[Instance] = set()
-            # avoid redundant check for union of literals
             for item in left.relevant_items():
                 p_item = get_proper_type(item)
+                if p_item in right_lit_types or p_item in right_lit_values:
+                    continue
                 lit_type = mypy.typeops.simple_literal_type(p_item)
                 if lit_type is not None:
-                    if lit_type in literal_types:
+                    if lit_type in right_lit_types:
                         continue
-                    literal_types.add(lit_type)
-                    item = lit_type
+                    if isinstance(self.right, Instance):
+                        if lit_type in literal_types:
+                            continue
+                        literal_types.add(lit_type)
+                        item = lit_type
+
                 if not self._is_subtype(item, self.orig_right):
                     return False
+
             return True
         return all(self._is_subtype(item, self.orig_right) for item in left.items)
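
For context, the kind of code the commit message describes (a chain of if statements over one enum) has roughly the shape below. This is a minimal sketch with a hypothetical four-member `Color` enum and `describe` function; the enum in the report has hundreds of members. The comments about narrowing reflect the behavior described in the commit message and are not verified output from mypy.

```python
from enum import Enum


class Color(Enum):
    # Hypothetical stand-in; the enum in the report has hundreds of members.
    RED = 1
    GREEN = 2
    BLUE = 3
    BLACK = 4


def describe(c: Color) -> str:
    if c is Color.RED:
        return "warm"
    # After this branch, the type checker is expected to treat `c` as a
    # union of the remaining literals (GREEN, BLUE, BLACK).
    if c is Color.GREEN:
        return "natural"
    # Each further branch yields another, slightly smaller narrowing of the
    # same enum; these unions are what get compared on frame exit.
    if c is Color.BLUE:
        return "cool"
    return "neutral"
```

With N members and a long chain of branches, each comparison between two such narrowings touches up to N entries on each side, which is where the pre-patch quadratic cost comes from.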