Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression from 0.750 - Literal types cause significant slowdown #9169

Closed
CommanderPrS opened this issue Jul 18, 2020 · 16 comments
Closed

Regression from 0.750 - Literal types cause significant slowdown #9169

CommanderPrS opened this issue Jul 18, 2020 · 16 comments

Comments

@CommanderPrS
Copy link

Are you reporting a bug, or opening a feature request?

This is a bug. It is a regression from the 0.750 version.

Please insert below the code you are checking with mypy, or a mock-up repro if the source is private. We would appreciate if you try to simplify your case to a minimal repro.

Here is the code that triggers this bug. It is essentially a list of TypedDict objects created in some nested fashion. The definitions for these objects are coming from the boto3-stubs library, which is a typed wrapper around the boto3 library, which is a Python SDK for talking to AWS REST APIs. The same behavior could probably be reproduced with just custom made TypeDict objects, but, given, the number of them and the relationship between them, it is easier just to reuse the ones from the library.

To install it, you need to have this line in requirements.txt (the exact version probably does not matter, but this is what I have):

boto3-stubs[autoscaling,cloudwatch,ec2,lambda,logs] == 1.10.35

IMPORTANT: After running pip install for the above line, one must also run this command to make boto3-stubs initialize itself:

python -m mypy_boto3

If doing this from tox.ini, you need to have this line there:

commands_pre = python -m mypy_boto3

Here is the offending code:

import typing as t

from mypy_boto3.ec2.type_defs import (
    DescribeInstancesResultTypeDef,
    FilterTypeDef,
    InstanceNetworkInterfaceAttachmentTypeDef,
    InstanceNetworkInterfaceTypeDef,
    InstanceStateTypeDef,
    InstanceTypeDef,
    PlacementTypeDef,
    ReservationTypeDef,
    TagTypeDef,
)

x: t.List[DescribeInstancesResultTypeDef] = [
    DescribeInstancesResultTypeDef(
        Reservations=[
            ReservationTypeDef(
                Instances=[
                    InstanceTypeDef(
                        InstanceId="az1-instance1-id1",
                        Placement=PlacementTypeDef(
                            AvailabilityZone="az1",
                        ),
                        State=InstanceStateTypeDef(
                            Name="running",
                        ),
                        NetworkInterfaces=[
                            InstanceNetworkInterfaceTypeDef(
                                Attachment=InstanceNetworkInterfaceAttachmentTypeDef(
                                    DeviceIndex=0,
                                ),
                                NetworkInterfaceId="az1-instance1-nic0-id1",
                                PrivateDnsName="az1-instance1-nic0-dns1",
                                PrivateIpAddress="az1-instance1-nic0-ip1",
                                SourceDestCheck=True,
                            ),
                        ],
                        Tags=[
                            TagTypeDef(
                                Key="Name",
                                Value="az1-instance1",
                            ),
                        ],
                    ),
                ],
            ),
        ],
        NextToken="next-token1",
    ),
    DescribeInstancesResultTypeDef(
        Reservations=[
            ReservationTypeDef(
                Instances=[
                    InstanceTypeDef(
                        InstanceId="az1-instance2-id2",
                        Placement=PlacementTypeDef(
                            AvailabilityZone="az1",
                        ),
                        State=InstanceStateTypeDef(
                            Name="running",
                        ),
                        NetworkInterfaces=[
                            InstanceNetworkInterfaceTypeDef(
                                Attachment=InstanceNetworkInterfaceAttachmentTypeDef(
                                    DeviceIndex=0,
                                ),
                                NetworkInterfaceId="az1-instance2-nic0-id2",
                                PrivateDnsName="az1-instance2-nic0-dns2",
                                PrivateIpAddress="az1-instance2-nic0-ip2",
                                SourceDestCheck=True,
                            ),
                        ],
                        Tags=[
                            TagTypeDef(
                                Key="Name",
                                Value="az1-instance2",
                            ),
                        ],
                    ),
                ],
            ),
        ],
        NextToken="next-token2",
    ),
    DescribeInstancesResultTypeDef(
        Reservations=[
            ReservationTypeDef(
                Instances=[
                    InstanceTypeDef(
                        InstanceId="az2-instance1-id1",
                        Placement=PlacementTypeDef(
                            AvailabilityZone="az2",
                        ),
                        State=InstanceStateTypeDef(
                            Name="running",
                        ),
                        NetworkInterfaces=[
                            InstanceNetworkInterfaceTypeDef(
                                Attachment=InstanceNetworkInterfaceAttachmentTypeDef(
                                    DeviceIndex=0,
                                ),
                                NetworkInterfaceId="az2-instance1-nic0-id1",
                                PrivateDnsName="az2-instance1-nic0-dns1",
                                PrivateIpAddress="az2-instance1-nic0-ip1",
                                SourceDestCheck=True,
                            ),
                        ],
                        Tags=[
                            TagTypeDef(
                                Key="Name",
                                Value="az2-instance1",
                            ),
                        ],
                    ),
                ],
            ),
        ],
    ),
]

What is the actual behavior/output?

When running mypy for this file, version 0.750 takes about 8-10 seconds to complete without any errors (with a brand new completely empty .mypy_cache). All versions after that (for the list of versions I tried, see below) essentially hang and never complete. When I was doing this on the actual project code, I had mypy running for at least half an hour. When running experiments specifically with the above code, I waited for a couple of minutes for each version before killing it in the Task Manager.

What is the behavior/output you expect?

mypy should not hang on this file.

What are the versions of mypy and Python you are using?

> python --version --version
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]

mypy versions tried:

  • 0.750 - works
  • 0.761 - hangs
  • 0.770 - hangs
  • 0.780 - hangs
  • 0.782 - hangs

Do you see the same issue after installing mypy from Git master?

I only tried the officially published versions listed above.

What are the mypy flags you are using? (For example --strict-optional)

Here is the content of mypy.ini that I used:

[mypy]
mypy_path = ./stubs

# How imports are handled
ignore_missing_imports = False
follow_imports = normal

# Enforce various type annotations
disallow_any_unimported = True
disallow_any_expr = False
disallow_any_decorated = False
disallow_any_explicit = False
disallow_any_generics = True
disallow_subclassing_any = True
disallow_untyped_calls = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
disallow_untyped_decorators = True

# Various code quality checks
no_implicit_optional = True
strict_optional = True
warn_unused_ignores = True
warn_no_return = True
warn_return_any = True
warn_unreachable = True
warn_redundant_casts = True
allow_untyped_globals = False
allow_redefinition = False
strict_equality = True

# How error messages are printed
show_error_context = False
show_column_numbers = True
show_error_codes = True
pretty = True

If mypy crashed with a traceback, please paste the full traceback below.

mypy does not crash, just hangs. It needs to be killed in the Task Manager to stop it. It does not even respond to Ctrl+C.

@TH3CHARLie
Copy link
Collaborator

TH3CHARLie commented Jul 18, 2020

Are you running on MacOS, if so, check this issue, maybe helpful: #8055

@CommanderPrS
Copy link
Author

@TH3CHARLie

Are you running on MacOS

No, this is Windows 10, 64-bit.

if so, check this issue, maybe helpful: #8055

I do not think this is related to the mypy startup. If this were, then different versions would be hanging in the same way. Yet, 0.750 works and all subsequent versions do not.

Another indicator that this is not a startup issue is this. If mypy were slow to start, then running it for different files would be hanging in exactly the same way. Yet, it is not the case. If I run mypy for the "bad" code, it hangs. If I kill it and immediately run for some "good" code, it works fine. If I retry it again for the "bad" code, it hangs again. Furthermore, even if I run it on the same file, but comment in or out the "bad" parts of it, mypy hangs only when the "bad" parts are present.

Finally, there is another clue here - to make my experiments clean, I completely delete the .mypy_cache directory. When mypy hangs, I see the cache being populated for some modules (like the imported ones) and some files (if I include other "good" files in the scan set). If mypy were not starting properly, .mypy_cache would not be populated at all.

@ethanhs
Copy link
Collaborator

ethanhs commented Jul 19, 2020

I was able to reduce things down to this:
https://gist.github.com/ethanhs/06e528df86e231d3dd4fba25e0123579

The culprit is the large number (270!) of literal values, which seems to be causing the slowdown.

@ethanhs ethanhs changed the title Regression from 0.750 - mypy hangs permanently Regression from 0.750 - Literal types cause significant slowdown Jul 19, 2020
@Akuli
Copy link
Contributor

Akuli commented Jul 21, 2020

The bug was introduced at 25b1c4d Fix 0.750 regression: union (non-)simplification should not affect inference (#8077)

found with dumb bash loop: until (rm -rf .mypy_cache/ && timeout 20 python3 -m mypy foo.py); do git checkout HEAD^; done

@Akuli
Copy link
Contributor

Akuli commented Jul 21, 2020

found the bug

  • make_simplified_union in mypy/typeops.py
  • items argument: [Literal['t1.micro'], Literal['t2.nano'], Literal['t2.micro'], Literal['t2.small'], Literal['t2.medium'], Literal['t2.large'], Literal['t2.xlarge'], Literal['t2.2xlarge'], Literal['t3.nano'], Literal['t3.micro'], Literal['t3.small'], Literal['t3.medium'], Literal['t3.large'], Literal['t3.xlarge'], Literal['t3.2xlarge'], Literal['t3a.nano'], Literal['t3a.micro'], Literal['t3a.small'], Literal['t3a.medium'], Literal['t3a.large'], Literal['t3a.xlarge'], Literal['t3a.2xlarge'], Literal['m1.small'], Literal['m1.medium'], Literal['m1.large'], Literal['m1.xlarge'], Literal['m3.medium'], Literal['m3.large'], Literal['m3.xlarge'], Literal['m3.2xlarge'], Literal['m4.large'], Literal['m4.xlarge'], Literal['m4.2xlarge'], Literal['m4.4xlarge'], Literal['m4.10xlarge'], Literal['m4.16xlarge'], Literal['m2.xlarge'], Literal['m2.2xlarge'], Literal['m2.4xlarge'], Literal['cr1.8xlarge'], Literal['r3.large'], Literal['r3.xlarge'], Literal['r3.2xlarge'], Literal['r3.4xlarge'], Literal['r3.8xlarge'], Literal['r4.large'], Literal['r4.xlarge'], Literal['r4.2xlarge'], Literal['r4.4xlarge'], Literal['r4.8xlarge'], Literal['r4.16xlarge'], Literal['r5.large'], Literal['r5.xlarge'], Literal['r5.2xlarge'], Literal['r5.4xlarge'], Literal['r5.8xlarge'], Literal['r5.12xlarge'], Literal['r5.16xlarge'], Literal['r5.24xlarge'], Literal['r5.metal'], Literal['r5a.large'], Literal['r5a.xlarge'], Literal['r5a.2xlarge'], Literal['r5a.4xlarge'], Literal['r5a.8xlarge'], Literal['r5a.12xlarge'], Literal['r5a.16xlarge'], Literal['r5a.24xlarge'], Literal['r5d.large'], Literal['r5d.xlarge'], Literal['r5d.2xlarge'], Literal['r5d.4xlarge'], Literal['r5d.8xlarge'], Literal['r5d.12xlarge'], Literal['r5d.16xlarge'], Literal['r5d.24xlarge'], Literal['r5d.metal'], Literal['r5ad.large'], Literal['r5ad.xlarge'], Literal['r5ad.2xlarge'], Literal['r5ad.4xlarge'], Literal['r5ad.8xlarge'], Literal['r5ad.12xlarge'], Literal['r5ad.16xlarge'], Literal['r5ad.24xlarge'], Literal['x1.16xlarge'], Literal['x1.32xlarge'], Literal['x1e.xlarge'], Literal['x1e.2xlarge'], Literal['x1e.4xlarge'], Literal['x1e.8xlarge'], Literal['x1e.16xlarge'], Literal['x1e.32xlarge'], Literal['i2.xlarge'], Literal['i2.2xlarge'], Literal['i2.4xlarge'], Literal['i2.8xlarge'], Literal['i3.large'], Literal['i3.xlarge'], Literal['i3.2xlarge'], Literal['i3.4xlarge'], Literal['i3.8xlarge'], Literal['i3.16xlarge'], Literal['i3.metal'], Literal['i3en.large'], Literal['i3en.xlarge'], Literal['i3en.2xlarge'], Literal['i3en.3xlarge'], Literal['i3en.6xlarge'], Literal['i3en.12xlarge'], Literal['i3en.24xlarge'], Literal['i3en.metal'], Literal['hi1.4xlarge'], Literal['hs1.8xlarge'], Literal['c1.medium'], Literal['c1.xlarge'], Literal['c3.large'], Literal['c3.xlarge'], Literal['c3.2xlarge'], Literal['c3.4xlarge'], Literal['c3.8xlarge'], Literal['c4.large'], Literal['c4.xlarge'], Literal['c4.2xlarge'], Literal['c4.4xlarge'], Literal['c4.8xlarge'], Literal['c5.large'], Literal['c5.xlarge'], Literal['c5.2xlarge'], Literal['c5.4xlarge'], Literal['c5.9xlarge'], Literal['c5.12xlarge'], Literal['c5.18xlarge'], Literal['c5.24xlarge'], Literal['c5.metal'], Literal['c5d.large'], Literal['c5d.xlarge'], Literal['c5d.2xlarge'], Literal['c5d.4xlarge'], Literal['c5d.9xlarge'], Literal['c5d.12xlarge'], Literal['c5d.18xlarge'], Literal['c5d.24xlarge'], Literal['c5d.metal'], Literal['c5n.large'], Literal['c5n.xlarge'], Literal['c5n.2xlarge'], Literal['c5n.4xlarge'], Literal['c5n.9xlarge'], Literal['c5n.18xlarge'], Literal['cc1.4xlarge'], Literal['cc2.8xlarge'], Literal['g2.2xlarge'], Literal['g2.8xlarge'], Literal['g3.4xlarge'], Literal['g3.8xlarge'], Literal['g3.16xlarge'], Literal['g3s.xlarge'], Literal['g4dn.xlarge'], Literal['g4dn.2xlarge'], Literal['g4dn.4xlarge'], Literal['g4dn.8xlarge'], Literal['g4dn.12xlarge'], Literal['g4dn.16xlarge'], Literal['cg1.4xlarge'], Literal['p2.xlarge'], Literal['p2.8xlarge'], Literal['p2.16xlarge'], Literal['p3.2xlarge'], Literal['p3.8xlarge'], Literal['p3.16xlarge'], Literal['p3dn.24xlarge'], Literal['d2.xlarge'], Literal['d2.2xlarge'], Literal['d2.4xlarge'], Literal['d2.8xlarge'], Literal['f1.2xlarge'], Literal['f1.4xlarge'], Literal['f1.16xlarge'], Literal['m5.large'], Literal['m5.xlarge'], Literal['m5.2xlarge'], Literal['m5.4xlarge'], Literal['m5.8xlarge'], Literal['m5.12xlarge'], Literal['m5.16xlarge'], Literal['m5.24xlarge'], Literal['m5.metal'], Literal['m5a.large'], Literal['m5a.xlarge'], Literal['m5a.2xlarge'], Literal['m5a.4xlarge'], Literal['m5a.8xlarge'], Literal['m5a.12xlarge'], Literal['m5a.16xlarge'], Literal['m5a.24xlarge'], Literal['m5d.large'], Literal['m5d.xlarge'], Literal['m5d.2xlarge'], Literal['m5d.4xlarge'], Literal['m5d.8xlarge'], Literal['m5d.12xlarge'], Literal['m5d.16xlarge'], Literal['m5d.24xlarge'], Literal['m5d.metal'], Literal['m5ad.large'], Literal['m5ad.xlarge'], Literal['m5ad.2xlarge'], Literal['m5ad.4xlarge'], Literal['m5ad.8xlarge'], Literal['m5ad.12xlarge'], Literal['m5ad.16xlarge'], Literal['m5ad.24xlarge'], Literal['h1.2xlarge'], Literal['h1.4xlarge'], Literal['h1.8xlarge'], Literal['h1.16xlarge'], Literal['z1d.large'], Literal['z1d.xlarge'], Literal['z1d.2xlarge'], Literal['z1d.3xlarge'], Literal['z1d.6xlarge'], Literal['z1d.12xlarge'], Literal['z1d.metal'], Literal['u-6tb1.metal'], Literal['u-9tb1.metal'], Literal['u-12tb1.metal'], Literal['u-18tb1.metal'], Literal['u-24tb1.metal'], Literal['a1.medium'], Literal['a1.large'], Literal['a1.xlarge'], Literal['a1.2xlarge'], Literal['a1.4xlarge'], Literal['a1.metal'], Literal['m5dn.large'], Literal['m5dn.xlarge'], Literal['m5dn.2xlarge'], Literal['m5dn.4xlarge'], Literal['m5dn.8xlarge'], Literal['m5dn.12xlarge'], Literal['m5dn.16xlarge'], Literal['m5dn.24xlarge'], Literal['m5n.large'], Literal['m5n.xlarge'], Literal['m5n.2xlarge'], Literal['m5n.4xlarge'], Literal['m5n.8xlarge'], Literal['m5n.12xlarge'], Literal['m5n.16xlarge'], Literal['m5n.24xlarge'], Literal['r5dn.large'], Literal['r5dn.xlarge'], Literal['r5dn.2xlarge'], Literal['r5dn.4xlarge'], Literal['r5dn.8xlarge'], Literal['r5dn.12xlarge'], Literal['r5dn.16xlarge'], Literal['r5dn.24xlarge'], Literal['r5n.large'], Literal['r5n.xlarge'], Literal['r5n.2xlarge'], Literal['r5n.4xlarge'], Literal['r5n.8xlarge'], Literal['r5n.12xlarge'], Literal['r5n.16xlarge'], Literal['r5n.24xlarge'], Literal['inf1.xlarge'], Literal['inf1.2xlarge'], Literal['inf1.6xlarge'], Literal['inf1.24xlarge']]
  • return value: UnionType.make_union(unchanged_copy_of_items_argument, line, column)
  • gets called many times, about 0.5 seconds each

If there's no way to get rid of the nested for loops over items or call this function less frequently, then maybe special case for lots of Literal[string]s?

I don't feel confident with trying to actually fix this because I don't know what this function is supposed to do, why it has nested loops or why it's called so often.

@Akuli
Copy link
Contributor

Akuli commented Jul 21, 2020

Some random thoughts:

  • To me it feels very weird that this union simplification function is called so often for the same Literal (which is implicitly a union of single-argument Literals). Shouldn't it be called only once for every Literal found in the source code? The correct solution might not be adding a cache around this but rather rearranging code in ways that I probably couldn't do without spending a lot of time getting familiar with mypy internals.
  • Even if the function is called only once for every multi-argument Literal, 0.5s is way too long imo. Imagine a long program with lots of Literals with 270 strings in each and 0.5s of mypy time spent on each.

IMO the correct way to fix this would be to call the function less often AND make the function run faster when called with many Literals of strings.

@Akuli
Copy link
Contributor

Akuli commented Jul 29, 2020

this makes mypy unusable with those aws related stubs, why does this have lower priority than "crashes on invalid syntax" bugs that only a few people have ran into?

@CommanderPrS
Copy link
Author

CommanderPrS commented Jul 30, 2020

this makes mypy unusable with those aws related stubs,

You are reading my mind, @Akuli. As you correctly pointed out, even on your reduced example mypy takes 1000*0.5 sec=500 seconds to run. 8 MINUTES (!) to check a very small sample definitely puts this into the "unusable" category. As a result of this, it looks like I am going to be stuck using 0.750 for any foreseeable future :(

@Akuli
Copy link
Contributor

Akuli commented Jul 30, 2020

you can use my fix from #9192, it makes mypy usable again

@CommanderPrS
Copy link
Author

@Akuli

you can use my fix from #9192, it makes mypy usable again

If this were just locally on my computer, then yes, that would work just fine. The problem comes when other developers need to work on the same source code. Or worse, linters, mypy and tests have to run in Jenkins. There is no way I can "patch" mypy there. I can not even prepare a custom build - whatever tools we have, must come from official sources (do not ask, the security team is going crazy). In other words, it has to be an official mypy release :'(

JukkaL pushed a commit that referenced this issue Jul 31, 2020
)

make_simplified_union no longer runs for 0.5 seconds every time it's called with a 
Union containing 270 literals of strings.

Like I explained in #9169, this only fixes half of the problem and I'm not capable of 
fixing the other half. This function is still called 1098 times for the "reduced" example 
code, and IMO it should be called only once.
@Akuli
Copy link
Contributor

Akuli commented Jul 31, 2020

now it's on mypy master

huguesb added a commit to huguesb/mypy that referenced this issue Sep 1, 2020
In PR python#9192 a fast path was created to address the slowness reported
in issue python#9169 wherein large Union or literal types would dramatically
slow down typechecking.

It is desirable to extend this fast path to cover Enum types, as these
can also leverage the O(n) set-based fast path instead of the O(n**2)
fallback.

This is seen to bring down the typechecking of a single fairly simple
chain of `if` statements operating on a large enum (~3k members) from
~40min to 12s in real-world code! Note that the timing is taken from
a pure-python run of mypy, as opposed to a compiled version.
huguesb added a commit to huguesb/mypy that referenced this issue Sep 1, 2020
In PR python#9192 a fast path was created to address the slowness reported
in issue python#9169 wherein large Union or literal types would dramatically
slow down typechecking.

It is desirable to extend this fast path to cover Enum types, as these
can also leverage the O(n) set-based fast path instead of the O(n**2)
fallback.

This is seen to bring down the typechecking of a single fairly simple
chain of `if` statements operating on a large enum (~3k members) from
~40min to 12s in real-world code! Note that the timing is taken from
a pure-python run of mypy, as opposed to a compiled version.
huguesb added a commit to huguesb/mypy that referenced this issue Sep 2, 2020
In PR python#9192 a fast path was created to address the slowness reported
in issue python#9169 wherein large Union or literal types would dramatically
slow down typechecking.

It is desirable to extend this fast path to cover Enum types, as these
can also leverage the O(n) set-based fast path instead of the O(n**2)
fallback.

This is seen to bring down the typechecking of a single fairly simple
chain of `if` statements operating on a large enum (~3k members) from
~40min to 12s in real-world code! Note that the timing is taken from
a pure-python run of mypy, as opposed to a compiled version.
huguesb added a commit to huguesb/mypy that referenced this issue Sep 2, 2020
In PR python#9192 a fast path was created to address the slowness reported
in issue python#9169 wherein large Union or literal types would dramatically
slow down typechecking.

It is desirable to extend this fast path to cover Enum types, as these
can also leverage the O(n) set-based fast path instead of the O(n**2)
fallback.

This is seen to bring down the typechecking of a single fairly simple
chain of `if` statements operating on a large enum (~3k members) from
~40min to 12s in real-world code! Note that the timing is taken from
a pure-python run of mypy, as opposed to a compiled version.
huguesb added a commit to huguesb/mypy that referenced this issue Sep 2, 2020
In PR python#9192 a fast path was created to address the slowness reported
in issue python#9169 wherein large Union or literal types would dramatically
slow down typechecking.

It is desirable to extend this fast path to cover Enum types, as these
can also leverage the O(n) set-based fast path instead of the O(n**2)
fallback.

This is seen to bring down the typechecking of a single fairly simple
chain of `if` statements operating on a large enum (~3k members) from
~40min to 12s in real-world code! Note that the timing is taken from
a pure-python run of mypy, as opposed to a compiled version.
huguesb added a commit to huguesb/mypy that referenced this issue Sep 3, 2020
In PR python#9192 a fast path was created to address the slowness reported
in issue python#9169 wherein large Union or literal types would dramatically
slow down typechecking.

It is desirable to extend this fast path to cover Enum types, as these
can also leverage the O(n) set-based fast path instead of the O(n**2)
fallback.

This is seen to bring down the typechecking of a single fairly simple
chain of `if` statements operating on a large enum (~3k members) from
~40min to 12s in real-world code! Note that the timing is taken from
a pure-python run of mypy, as opposed to a compiled version.
huguesb added a commit to huguesb/mypy that referenced this issue Sep 3, 2020
In PR python#9192 a fast path was created to address the slowness reported
in issue python#9169 wherein large Union or literal types would dramatically
slow down typechecking.

It is desirable to extend this fast path to cover Enum types, as these
can also leverage the O(n) set-based fast path instead of the O(n**2)
fallback.

This is seen to bring down the typechecking of a single fairly simple
chain of `if` statements operating on a large enum (~3k members) from
~40min to 12s in real-world code! Note that the timing is taken from
a pure-python run of mypy, as opposed to a compiled version.
@hauntsaninja
Copy link
Collaborator

Closing, since #9192 was merged

@Akuli
Copy link
Contributor

Akuli commented Oct 2, 2020

@hauntsaninja Are you sure that this should be closed? #9192 is an ugly workaround that doesn't fix the underlying problem introduced in #8077.

@hauntsaninja
Copy link
Collaborator

Sure, I can reopen. #8624 also seems related.

@hauntsaninja hauntsaninja reopened this Oct 2, 2020
msullivan pushed a commit that referenced this issue Jul 7, 2021
In PR #9192 a fast path was created to address the slowness reported
in issue #9169 wherein large Union or literal types would dramatically
slow down typechecking.

It is desirable to extend this fast path to cover Enum types, as these
can also leverage the O(n) set-based fast path instead of the O(n**2)
fallback.

This is seen to bring down the typechecking of a single fairly simple
chain of `if` statements operating on a large enum (~3k members) from
~40min to 12s in real-world code! Note that the timing is taken from
a pure-python run of mypy, as opposed to a compiled version.
@bogdandm
Copy link

bogdandm commented Oct 25, 2021

I faced the same issue recently. We have huge Literal with list of project features. Currently it have 269 items. I track problem to this line of code:

# mypy.meet.TypeMeetVisitor.visit_union_type
    def visit_union_type(self, t: UnionType) -> ProperType:
        if isinstance(self.s, UnionType):
            meets = []  # type: List[Type]
            for x in t.items:
                for y in self.s.items:
                    meets.append(meet_types(x, y))
        else:
            meets = [meet_types(x, self.s)
                     for x in t.items]
        return make_simplified_union(meets)

It generates meets list with with 269 * 269 items most of them are empty:
image

And then it again loop it 2 times

# venv10/lib/python3.10/site-packages/mypy/typeops.py:365
for i, ti in enumerate(items):
	if i in removed: continue
	# Keep track of the truishness info for deleted subtypes which can be relevant
	cbt = cbf = False
	for j, tj in enumerate(items):
		if i != j and is_proper_subtype(tj, ti, keep_erased_types=keep_erased) and \
				is_redundant_literal_instance(ti, tj):

so it is 5 300 000 000 iterations for 1 (!) union check in code. And we have few hundreds of them...

Is there anything we can do with this <nothing> in union items before this two nested loops?

P.S. Python3.10, mypy0.910

update: I think it is fixed in master branch (and future 920 release)

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Jul 17, 2022

Closing this, since many improvements have been made in this space, e.g. #12630
While things can still be improved, #12526 is a better issue to track further improvements (e.g. it has more discussion on what it would take to have caching over make_simplified_union)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants