Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inconsistencies of known_first_party and --resolve-all-configs argument #2045

Open
tiagoraulp opened this issue Dec 16, 2022 · 2 comments · May be fixed by #2142
Open

inconsistencies of known_first_party and --resolve-all-configs argument #2045

tiagoraulp opened this issue Dec 16, 2022 · 2 comments · May be fixed by #2142

Comments

@tiagoraulp
Copy link

tiagoraulp commented Dec 16, 2022

I have a repo with this tree:

directory_root

    packageA
        setup.cfg
        module1.py
        module2.py

    packageB
        module3.py

With packageA/module1.py having the following contents:

import module2
import packageB
import numpy
import os

and packageA/setup.cfg having the following contents:

[isort]
known_first_party = bla

If I run isort with:
python -m isort --verbose --check-only packageA/module1.py packageA/module2.py packageB/module3.py
I get the following output:


                 _                 _
                (_) ___  ___  _ __| |_
                | |/ _/ / _ \/ '__  _/
                | |\__ \/\_\/| |  | |_
                |_|\___/\___/\_/   \_/

      isort your imports, so you don't have to.

                    VERSION 5.11.2

else-type place_module for module2 returned FIRSTPARTY
else-type place_module for packageB returned THIRDPARTY
else-type place_module for numpy returned THIRDPARTY
else-type place_module for os returned STDLIB
ERROR: /directory_root/packageA/module1.py Imports are incorrectly sorted and/or formatted.
SUCCESS: /directory_root/packageA/module2.py Everything Looks Good!
SUCCESS: /directory_root/packageB/module3.py Everything Looks Good!

which is what I expected. So if I run isort without --check-only option, the original packageA/module1.py file is reformatted to:

import os

import numpy
import packageB

import module2

However, I want to start having support for multiple config files in a single isort run, so I have to use the --resolve-all-configs argument:
python -m isort --verbose --resolve-all-configs --check-only packageA/module1.py packageA/module2.py packageB/module3.py
which gives me the following output:

                 _                 _
                (_) ___  ___  _ __| |_
                | |/ _/ / _ \/ '__  _/
                | |\__ \/\_\/| |  | |_
                |_|\___/\___/\_/   \_/

      isort your imports, so you don't have to.

                    VERSION 5.11.2

./packageA/setup.cfg used for file packageA/module1.py
ERROR: /directory_root/packageA/module1.py Imports are incorrectly sorted and/or formatted.
./packageA/setup.cfg used for file packageA/module2.py
default used for file packageB/module3.py

First Question:
Why is the verbose debug information different when using the --resolve-all-configs argument?
The else-type place_module for <some_import> returned <SECTION_NAME> messages are gone, as well as the SUCCESS: <some_file> Everything Looks Good! message.


Now, let's see the reformatting result when I run with argument --resolve-all-configs and without --check-only option:

import os

import module2
import numpy

import packageB

To me this is quite unexpected, packageB went from THIRDPARTY to FIRSTPARTY and module2 went from FIRSTPARTY to THIRDPARTY.


Second Question:
(related to known_first_party inconsistencies!)
Why is the sorting result different when using or not the the --resolve-all-configs argument, if the config .cfg file used is the same?
If the same packageA/setup.cfg is used, I would expect the same sorting result.


Furthermore, I noticed using --resolve-all-configs argument makes isort much slower, taking a couple seconds for such a simple repo.
However, I also noticed that adding more folders/packages makes isort even slower, even if the request is exactly the same (sorting module1/2/3 only in package A and B).
For example, if I added 20 more packages (packageC, packageD, ..., packageZ), with plenty of folders inside each one, the isort run with the --resolve-all-configs argument took almost 20 seconds, even if only 3 modules in 2 packages are being reformatted.


Third Question:
Why does isort with --resolve-all-configs argument become slower when adding more folders/packages, even if it is still sorting the same number of files?
Shouldn't the search for the closest config file happen only for the files that are requested to sort, thus making isort run time only dependent on the number of files to check/reformat instead of the repo size?


Finally, I went back to testing the isort standard behavior when using only a single config file (without the --resolve-all-configs argument).
I changed the packageA/setup.cfg file by removing the known_first_party entry, thus making it empty, which I expect to make it resort to default configuration:

[isort]

If I run isort again with:
python -m isort --verbose packageA/module1.py packageA/module2.py packageB/module3.py
I get the following output:


                 _                 _
                (_) ___  ___  _ __| |_
                | |/ _/ / _ \/ '__  _/
                | |\__ \/\_\/| |  | |_
                |_|\___/\___/\_/   \_/

      isort your imports, so you don't have to.

                    VERSION 5.11.2

else-type place_module for module2 returned THIRDPARTY
else-type place_module for packageB returned FIRSTPARTY
else-type place_module for numpy returned THIRDPARTY
else-type place_module for os returned STDLIB
Fixing /directory_root/packageA/module1.py

which reformats the original file in this way:

import os

import module2
import numpy

import packageB

Again, compared to the original configuration, packageB went from THIRDPARTY to FIRSTPARTY and module2 went from FIRSTPARTY to THIRDPARTY.


Fourth (and Final) Question:
(related to known_first_party inconsistencies!)
Why do both packageB and module2 change sections if I have a default configuration vs known_first_party = bla, where bla was not even a package existing in my repo?


For module1 formatting, I expected always to have module2 as FIRSTPARTY because it's part of the same python package and lives in the same directory, and packageB as THIRDPARTY because it's a separate package from packageA where module1 lives.
And why setting known_first_party to some specific name would have an impact? Does it replace the default FIRST_PARTY, or extend?

Additionally and probably the most important question in this issue, why does --resolve-all-configs argument change the sorting result if exactly the same config file is applied?

@tiagoraulp tiagoraulp changed the title inconsistencies of known_first_party inconsistencies of known_first_party and --resolve-all-configs argument Dec 16, 2022
potiuk added a commit to potiuk/airflow that referenced this issue Dec 18, 2022
The recent isort changed their mind on sorting the imports. This
change follows the change and bumps isort to latest released
version (isort has no install_requires on its own so bumping
min version has no effect on other dependencies)

This change adds a number of isort:skip_file, isort:off, isort:skips
in order to handle a very annoying bug in isort, that no matter how
much you try, it sometimes treat "known first party" packages
differently - depending on how many files it processes at a time.

We should be able to restore it after this bug is fixed:
PyCQA/isort#2045

This change also updates the common.sql API to skip them from isort
for the very same reason (depending on how many files are modified,
the isort order might change.
potiuk added a commit to apache/airflow that referenced this issue Dec 18, 2022
The recent isort changed their mind on sorting the imports. This
change follows the change and bumps isort to latest released
version (isort has no install_requires on its own so bumping
min version has no effect on other dependencies)

This change adds a number of isort:skip_file, isort:off, isort:skips
in order to handle a very annoying bug in isort, that no matter how
much you try, it sometimes treat "known first party" packages
differently - depending on how many files it processes at a time.

We should be able to restore it after this bug is fixed:
PyCQA/isort#2045

This change also updates the common.sql API to skip them from isort
for the very same reason (depending on how many files are modified,
the isort order might change.
gschuurman pushed a commit to gschuurman/airflow that referenced this issue Dec 19, 2022
The recent isort changed their mind on sorting the imports. This
change follows the change and bumps isort to latest released
version (isort has no install_requires on its own so bumping
min version has no effect on other dependencies)

This change adds a number of isort:skip_file, isort:off, isort:skips
in order to handle a very annoying bug in isort, that no matter how
much you try, it sometimes treat "known first party" packages
differently - depending on how many files it processes at a time.

We should be able to restore it after this bug is fixed:
PyCQA/isort#2045

This change also updates the common.sql API to skip them from isort
for the very same reason (depending on how many files are modified,
the isort order might change.
ephraimbuddy pushed a commit to apache/airflow that referenced this issue Jan 12, 2023
The recent isort changed their mind on sorting the imports. This
change follows the change and bumps isort to latest released
version (isort has no install_requires on its own so bumping
min version has no effect on other dependencies)

This change adds a number of isort:skip_file, isort:off, isort:skips
in order to handle a very annoying bug in isort, that no matter how
much you try, it sometimes treat "known first party" packages
differently - depending on how many files it processes at a time.

We should be able to restore it after this bug is fixed:
PyCQA/isort#2045

This change also updates the common.sql API to skip them from isort
for the very same reason (depending on how many files are modified,
the isort order might change.

(cherry picked from commit f115b20)
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Mar 31, 2023
The recent isort changed their mind on sorting the imports. This
change follows the change and bumps isort to latest released
version (isort has no install_requires on its own so bumping
min version has no effect on other dependencies)

This change adds a number of isort:skip_file, isort:off, isort:skips
in order to handle a very annoying bug in isort, that no matter how
much you try, it sometimes treat "known first party" packages
differently - depending on how many files it processes at a time.

We should be able to restore it after this bug is fixed:
PyCQA/isort#2045

This change also updates the common.sql API to skip them from isort
for the very same reason (depending on how many files are modified,
the isort order might change.

GitOrigin-RevId: f115b207bc844c10569b2df6fc9acfa32a3c7f41
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Apr 4, 2023
The recent isort changed their mind on sorting the imports. This
change follows the change and bumps isort to latest released
version (isort has no install_requires on its own so bumping
min version has no effect on other dependencies)

This change adds a number of isort:skip_file, isort:off, isort:skips
in order to handle a very annoying bug in isort, that no matter how
much you try, it sometimes treat "known first party" packages
differently - depending on how many files it processes at a time.

We should be able to restore it after this bug is fixed:
PyCQA/isort#2045

This change also updates the common.sql API to skip them from isort
for the very same reason (depending on how many files are modified,
the isort order might change.

(cherry picked from commit f115b207bc844c10569b2df6fc9acfa32a3c7f41)

GitOrigin-RevId: d7db157adf1ad1fc687ce6e203d71fab7f495f25
@sudowork
Copy link

sudowork commented Jun 6, 2023

I was running into this issue as well. I'm new to this repo, so take this with a grain of salt. After looking through the code I think the main issue is that: When using --resolve-all-configs, it resolves src_paths based on the the project root, which will default to the passed in --config-root or the current working directory.

One workaround with the current implementation (5.12.0) is to provide src_paths pointing to relative/path/to/src/files from the project root. So in your example, I believe it would work if you consistently call isort from using the same project root and if you provide:

# projectA/setup.cfg
src_paths =
  projectA

I see multiple issues with this workaround though:

  1. Now projects must be aware of its ancestry in the file system.
  2. Since most likely the use case for --resolve-all-configs is to be able to support multiple independent projects in one repo, would make sense for options like src_paths to be relative to the configuration file itself.
  3. When calling isort, you must provide a consistent config root or current working directory.

With the assumption that the most common use case for --resolve-all-configs would be to support multiple projects, my proposal would be to pass the directory of the config file when constructing the Config when the --resolve-all-configs option is specified. If doing so, then in your example, src_paths for projectA would resolve to /path/to/projectA,/path/to/projectA,src by default. With this approach though, it may break projects that depend on the old behavior.

I can put up a PR with the proposed change to see if it makes sense.

sudowork added a commit to sudowork/isort that referenced this issue Jun 6, 2023
When using `--resolve-all-configs`, there is unexpected behavior in that
`src_paths` ends up resolving relative to the project root, which
defaults to the current working directory. This results in first-party
modules being marked as third-party modules in the default case.

Under the previous implementation, one possible workaround would be to
specify the relative path to config directory (e.g.
`relative/path/to/configdir/src`). However, assuming that the most
common use of `--resolve-all-configs` is to support multiple
sub-projects in the same repository/overall directory, this workaround
would now require each sub-project to understand where it lives in the
filesystem.

This change proposes a fix that sets `directory` on the `config_data` to
be the directory containing the used configuration file if not already
set. Downstream, this directory is then used to resolve the absolute
paths specified by `src_paths`.

Fixes PyCQA#2045
@sudowork
Copy link

sudowork commented Jun 7, 2023

Also, since I was digging into this issue yesterday, I think I have the answers to your questions:

First Question:
Why is the verbose debug information different when using the --resolve-all-configs argument?
The else-type place_module for <some_import> returned <SECTION_NAME> messages are gone, as well as the SUCCESS: <some_file> Everything Looks Good! message.

The CLI flags don't get passed through when resolving the config. You can workaround this by adding verbose = true in your project's config file. I believe that #2119 fixes this.

Second Question:
(related to known_first_party inconsistencies!)
Why is the sorting result different when using or not the the --resolve-all-configs argument, if the config .cfg file used is the same?
If the same packageA/setup.cfg is used, I would expect the same sorting result.

Fourth (and Final) Question:
(related to known_first_party inconsistencies!)
Why do both packageB and module2 change sections if I have a default configuration vs known_first_party = bla, where bla was not even a package existing in my repo?

This is the issue I mentioned above where it basically will treat the current working directory as the root of the project folder. The PR I put up in #2142 resolves this and makes it behave as you and I both expected.

Third Question:
Why does isort with --resolve-all-configs argument become slower when adding more folders/packages, even if it is still sorting the same number of files?
Shouldn't the search for the closest config file happen only for the files that are requested to sort, thus making isort run time only dependent on the number of files to check/reformat instead of the repo size?

The implementation actually does a recursive walk of the file system starting from the config root. It also does not respect skips, so it will end up recursing into directories like .venv, node_modules, etc. Edit: I updated the implementation in my PR #2142 to address this and improve performance of find_all_configs by pruning out directories to skip and other small improvements.

ahidalgob pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Nov 7, 2023
The recent isort changed their mind on sorting the imports. This
change follows the change and bumps isort to latest released
version (isort has no install_requires on its own so bumping
min version has no effect on other dependencies)

This change adds a number of isort:skip_file, isort:off, isort:skips
in order to handle a very annoying bug in isort, that no matter how
much you try, it sometimes treat "known first party" packages
differently - depending on how many files it processes at a time.

We should be able to restore it after this bug is fixed:
PyCQA/isort#2045

This change also updates the common.sql API to skip them from isort
for the very same reason (depending on how many files are modified,
the isort order might change.

GitOrigin-RevId: f115b207bc844c10569b2df6fc9acfa32a3c7f41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants