Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compare each .gitignore found with an appropiate relative path #3338

Merged
merged 6 commits into from Nov 8, 2022

Conversation

aaossa
Copy link
Contributor

@aaossa aaossa commented Oct 17, 2022

Description

Fix #3315 : Fix incorrectly applied .gitignore rules by considering the .gitignore location and the relative path to the target file. Commit message of the commit that fixed the bug:

When a .gitignore file contains the special rule to ignore every subfolder content (*/*) and the file is located in a subfolder
relative to where the command is executed (root), the rule is incorrectly applied and ignores every file at the same level of the .gitignore file.

The reason for this is that the gitignore variable accumulates the rules found in each .gitignore while traversing files and directories recursively. This makes sense and, in general, works as expected. The problem is that the gitignore rules are applied using as the root of the target directory as a reference. This is the cause of the bug.

The implemented solution keeps track of every .gitignore file found while traversing the targets and the location of each .gitignore. Then, when matching files to the .gitignore rules, compare each set of rules with the appropiate relative path to the candidate target file.

To make this possible, we changed the single gitignore object with a dictionary of similar objects, where the corresponding key is the path to the folder that contains that .gitignore file. This required changing the signature of the get_sources function. Also, we introduce a is_ignored function that compares a file with every set of rules. Finally, some tests required an update to pass the gitignore object in the new format.

Checklist - did you ...

  • Add an entry in CHANGES.md if necessary?
  • Add / update tests if necessary?
  • Add new / update outdated documentation?

@aaossa aaossa force-pushed the issue-3315 branch 2 times, most recently from 10dc09c to e4bae08 Compare October 17, 2022 19:54
@github-actions
Copy link

github-actions bot commented Oct 17, 2022

diff-shades reports zero changes comparing this PR (c09a816) to main (0e9d29a).


What is this? | Workflow run | diff-shades documentation

@ichard26 ichard26 self-requested a review October 18, 2022 01:18
src/black/files.py Outdated Show resolved Hide resolved
@aaossa aaossa marked this pull request as draft October 20, 2022 15:30
@aaossa
Copy link
Contributor Author

aaossa commented Oct 20, 2022

Converting to draft because I'm getting strange results locally

src/black/files.py Outdated Show resolved Hide resolved
@aaossa aaossa marked this pull request as ready for review October 20, 2022 16:56
@aaossa
Copy link
Contributor Author

aaossa commented Oct 20, 2022

This is PR is back up again and ready for review 👍

@@ -660,14 +662,14 @@ def get_sources(
elif p.is_dir():
if exclude is None:
exclude = re_compile_maybe_verbose(DEFAULT_EXCLUDES)
gitignore = get_gitignore(root)
root_gitignore = get_gitignore(root)
p_gitignore = get_gitignore(p)
# No need to use p's gitignore if it is identical to root's gitignore
# (i.e. root and p point to the same directory).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there are more than two levels? For example, we're formatting a/b/c/ and a/.gitignore/, a/b/.gitignore, and a/b/c/.gitignore all exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Right now, if the command is black a/b/c/ then a/b/.gitignore is completely skipped while a/.gitignore and a/b/c/.gitignore are captured by root_gitignore and p_gitignore, respectively. What's the expected behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a separate issue/PR? If its an undefined behavior, then it should be documented properly I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there are more than two levels? For example, we're formatting a/b/c/ and a/.gitignore/, a/b/.gitignore, and a/b/c/.gitignore all exist.

The current behavior seems to be that black a/b/c only detects a/b/c/.gitignore and completely ignores other gitignore files. This PR implements the same behavior

src/black/files.py Outdated Show resolved Hide resolved
@@ -198,7 +198,7 @@ def gen_python_files(
extend_exclude: Optional[Pattern[str]],
force_exclude: Optional[Pattern[str]],
report: Report,
gitignore: Optional[PathSpec],
gitignore: Optional[Dict[Path, PathSpec]],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would suggest renaming this to gitignore_dict for clarity. Also, no reason to keep this Optional, we can just use an empty dict if there's no gitignores.

Copy link
Contributor Author

@aaossa aaossa Oct 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gitignore can be None if exclude is not None. In that case, no gitignore is considered at any level. I though that this is the expected behavior while reading the code. If get_sources receives a value for exclude, what's the expected behavior?

About renaming the variable: I agree 👍

src/black/files.py Outdated Show resolved Hide resolved
@aaossa aaossa marked this pull request as draft November 2, 2022 03:05
@aaossa aaossa force-pushed the issue-3315 branch 2 times, most recently from 50661f4 to f97cccd Compare November 2, 2022 19:41
When a .gitignore file contains the special rule to ignore every
subfolder content (`*/*`) and the file is located in a subfolder
relative to where the command is executed (root), the rule is
incorrectly applied and ignores every file at the same level of the
.gitignore file.

The reason for this is that the `gitignore` variable accumulates the
rules found in each .gitignore while traversing files and directories
recursively. This makes sense and, in general, works as expected. The
problem is that the gitignore rules are applied using as the relative
path from root to target directory as a reference. This is the cause
of the bug.

The implemented solution keeps track of every .gitignore file found
while traversing the targets and the absolute location of each
.gitignore file. Then, when matching files to the .gitignore rules,
compare each set of rules with the appropiate relative path to the
candidate target file.

To make this possible, we changed the single `gitignore` object with a
dictionary of similar objects, where the corresponding key is the
absolute path to the folder that contains that .gitignore file. This
required changing the signature of the `get_sources` function. Also, we
introduce a `is_ignored` function that compares a file with every set
of rules. Finally, some tests required an update to pass the gitignore
object in the new format.

Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
The test contains three cases: 1) when the .gitignore with the special
rule to ignore every subfolder and its contents (*/*) is in the root,
2) when the file is inside a subfolder relative to root (nested), and
3) when the target folder contains the .gitignore and root is a parent
folder of the target. In all of these cases, we compare the files that
are visible by Black with a known list of paths containing the
expected values.

Before the fix introduced in the previous commit, these tests failed
when the .gitignore file was nested (second case). Now, the test is
passed for all cases.

Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
Add entry about fixed bug and changes introduced: ignore files by
considering the location of each .gitignore file and the relative path
of each target

Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
@aaossa aaossa marked this pull request as ready for review November 2, 2022 20:02
@aaossa aaossa requested review from JelleZijlstra and removed request for ichard26 November 2, 2022 20:04
src/black/__init__.py Outdated Show resolved Hide resolved
@aaossa aaossa marked this pull request as draft November 5, 2022 12:41
if relative_path is None:
break
if _gitignore is not None and _gitignore.match_file(relative_path):
report.path_ignored(child, "matches the .gitignore file content")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe include the (full) path to the .gitignore file that matched?

These changes are small improvements to improve code readability:
rename a variable to a more descriptive name (from `exclude_is_None`
to `using_default_exclude`), use a better syntax to include the type
annotation for `gitignore` variable (from typing comment to
Python-style typing annotation), and replace an if-else block with a
single dictionary definition (in this case, we need to compare keys
instead of values, meaning that the change works)

Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
@aaossa aaossa marked this pull request as ready for review November 7, 2022 17:57
@aaossa aaossa force-pushed the issue-3315 branch 2 times, most recently from f257725 to c09a816 Compare November 8, 2022 13:12
The function to match a given path with every discovered .gitignore
file does not need to be a nested function and can be a top-level
function. The arguments did not change, but the naming of local
variables was improved for readability.

Signed-off-by: Antonio Ossa Guerra <aaossa@uc.cl>
@JelleZijlstra JelleZijlstra merged commit ffaaf48 into psf:main Nov 8, 2022
@aaossa aaossa deleted the issue-3315 branch November 8, 2022 15:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrectly ignored files when subfolder/.gitignore contains */*
2 participants