Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recusive Directory Junctions causes git clean -fd to return a warning #3358

Open
1 task done
spiltcoffee opened this issue Aug 11, 2021 · 11 comments · May be fixed by #4383
Open
1 task done

Recusive Directory Junctions causes git clean -fd to return a warning #3358

spiltcoffee opened this issue Aug 11, 2021 · 11 comments · May be fixed by #4383

Comments

@spiltcoffee
Copy link

spiltcoffee commented Aug 11, 2021

  • I was not able to find an open or closed issue matching what I'm seeing

Setup

  • Which version of Git for Windows are you using? Is it 32-bit or 64-bit?
$ git --version --build-options

git version 2.32.0.windows.2
cpu: x86_64
built from commit: 3d45ac813c4adf97fe3733c1f763ab6617d5add5
sizeof-long: 4
sizeof-size_t: 8
shell-path: /bin/sh
feature: fsmonitor--daemon
  • Which version of Windows are you running? Vista, 7, 8, 10? Is it 32-bit or 64-bit?
$ cmd.exe /c ver

Microsoft Windows [Version 10.0.19043.1110]
  • What options did you set as part of the installation? Or did you choose the
    defaults?
# One of the following:
> type "C:\Program Files\Git\etc\install-options.txt"
> type "C:\Program Files (x86)\Git\etc\install-options.txt"
> type "%USERPROFILE%\AppData\Local\Programs\Git\etc\install-options.txt"
$ cat /etc/install-options.txt

Editor Option: VIM
Custom Editor Path:
Default Branch Option:
Path Option: Cmd
SSH Option: OpenSSH
Tortoise Option: false
CURL Option: OpenSSL
CRLF Option: CRLFAlways
Bash Terminal Option: MinTTY
Git Pull Behavior Option: Merge
Use Credential Manager: Core
Performance Tweaks FSCache: Enabled
Enable Symlinks: Enabled
Enable Pseudo Console Support: Disabled
Enable FSMonitor: Disabled
  • Any other interesting things about your environment that might be related
    to the issue you're seeing?

Not that I'm aware of

Details

  • Which terminal/shell are you running Git from? e.g Bash/CMD/PowerShell/other

Bash and CMD

@echo off
REM using cmd.exe/as a .bat script:

cd /D %TEMP%
mkdir recursive-delete
cd recursive-delete
git init
mkdir a
cd a
mkdir b
cd b
mklink /D /J a ..
cd %TEMP%/recursive-delete
git clean -fd
  • What did you expect to occur after running these commands?

Just the following output:

Removing a/
  • What actually happened instead?

A warning appeared beforehand:

warning: could not open directory 'a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/': Function not implemented
Removing a/
  • If the problem was occurring with a specific repository, can you provide the
    URL to that repository to help us with testing?

N/A

@dscho
Copy link
Member

dscho commented Aug 16, 2021

This warning comes from the part of the code that enumerates which files are there, and what their respective status is (tracked, ignored, untracked).

The underlying problem is that the reparse point cannot be resolved due to an infinite recursion. I have this tentative diff locally:

diff --git a/compat/mingw.c b/compat/mingw.c
index cd046da97af..c9843fece4e 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -144,6 +144,7 @@ int err_win_to_posix(DWORD winerr)
 	case ERROR_WAIT_NO_CHILDREN: error = ECHILD; break;
 	case ERROR_WRITE_FAULT: error = EIO; break;
 	case ERROR_WRITE_PROTECT: error = EROFS; break;
+	case ERROR_CANT_RESOLVE_FILENAME: error = ELOOP; break;
 	}
 	return error;
 }

The problem is that strerror(ELOOP) prints out an unhelpful "Unknown error" on Windows:

warning: could not open directory 'a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/': Unknown error

I am not quite sure how best to proceed from here. Currently, my favorite solution would involve introducing platform-specific platform_strerror(int errno) that can override/extend strerror(). While it is my favorite solution, I still find it very inelegant.

@dscho
Copy link
Member

dscho commented Aug 16, 2021

BTW the problem has very little to do with the git clean command itself: even a git status triggers the error.

@rimrul
Copy link
Member

rimrul commented Aug 17, 2021

The problem is that strerror(ELOOP) prints out an unhelpful "Unknown error" on Windows:

warning: could not open directory 'a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/': Unknown error

I am not quite sure how best to proceed from here. Currently, my favorite solution would involve introducing platform-specific platform_strerror(int errno) that can override/extend strerror(). While it is my favorite solution, I still find it very inelegant.

Honestly, the whole concept of mappping GetLastError() to errno is inelegant in the fact it loses precision. I'm pretty sure FormatMessage() wouldn't produce "Unknown Error" or "Function not implemented" for ERROR_CANT_RESOLVE_FILENAME.

@dscho
Copy link
Member

dscho commented Aug 17, 2021

Honestly, the whole concept of mappping GetLastError() to errno is inelegant in the fact it loses precision.

Agreed.

Introducing an abstraction layer that does not assume that everything has a POSIX errno is not going to be a small project, though:

$ git grep 'errno = ' | wc -l
257

@jeffrson
Copy link

I have a similar issue when using pnpm (a nodejs package manager) with workspaces - which may be completely different. Can open another issue if needed.

pnpm creates a symlinked structure (junctions on Windows) inside node_modules. git clean -ffdx . reports warnings like this:

warning: could not open directory 'node_modules/copy-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/@webassemblyjs/wasm-edit/node_modules/@webassemblyjs/helper-wasm-section/node_modules/@webassemblyjs/wasm-gen/node_modules/@webassemblyjs/ast/node_modules/@webassemblyjs/helper-numbers/node_modules/@webassemblyjs/floating-point-hex-parser/': Function not implemented

I'm not sure, whether the folder might look like "recursive" - directory enumeration above is not.

However, most notably, "node_modules" is in .gitignore - git status is fine, and so I'd think git clean -ffdx should just remove any junction as soon as it finds one, without traversing.

@dscho
Copy link
Member

dscho commented Mar 10, 2022

[...]/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/node_modules/webpack/node_modules/terser-webpack-plugin/[...]

That almost looks like a symlink loop, but then it ends in:

[...]/@webassemblyjs/wasm-gen/node_modules/@webassemblyjs/ast/node_modules/@webassemblyjs/helper-numbers/node_modules/@webassemblyjs/floating-point-hex-parser/

In any case, it is an awfully long path. I am just surprised that the error message suggests ENOSYS when it probably should be ENAMETOOLONG. I guess The reason for that is that we default to ENOSYS when encountering unknown error conditions.

Now, it would be really interesting if you could build Git for Windows, patch compat/mingw.c to report the unknown error number (via inserting default: error("unknown winerr: %ld", winerr); break; into that long `switch statement) and then tell us that number, so that we can fix the error message.

You could also see whether my tentative diff would fix the problem on your side.

@jakebailey
Copy link

I'm working on converting a big monorepo to pnpm and I have intentionally introduced cyclic symlinks (kinda like the above), and now hit this problem with git clean where it seems to keep walking down the symlinks in Windows.

Is there anything I can try here? Are the diffs provided above still accurate?

@jakebailey
Copy link

jakebailey commented Apr 11, 2023

Testing the "tentative diff" (#3358 (comment)), I get:

.../@types/node/node_modules/@types/node/node_modules/@types/node/': Unknown error

As expected, it seems.

But, why does git walk down the entire loop? It doesn't behave like this on the same structure in Linux/macOS; the clean properly deletes the directory without recursing. It seems like this patch is just intending to change the error, but as noted in previous replies, it seems like it shouldn't be recursing at all.

@jakebailey
Copy link

Also, going via WSL2 into the same dir behaves properly; so the symlink information and not recursing seems to work, but not in the windows version of git, it seems.

@jakebailey
Copy link

jakebailey commented Apr 12, 2023

I've been digging, and I think I've gotten to the root. finddata2dirent checks for IO_REPARSE_TAG_SYMLINK, but the symlinks created by pnpm are IO_REPARSE_TAG_MOUNT_POINT. The latter seems to be more commonly associated with git's symlink handling and switching the check makes it match the code in mingw_is_mount_point exactly.

I am not an expert in these syscalls, though, but just this fixes the problem for me:

diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c
index 87063101f5..3e543b51ef 100644
--- a/compat/win32/dirent.c
+++ b/compat/win32/dirent.c
@@ -19,7 +19,7 @@ static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata)
 
 	/* Set file type, based on WIN32_FIND_DATA */
 	if ((fdata->dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT)
-			&& fdata->dwReserved0 == IO_REPARSE_TAG_SYMLINK)
+			&& fdata->dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT)
 		ent->d_type = DT_LNK;
 	else if (fdata->dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
 		ent->d_type = DT_DIR;

@jakebailey
Copy link

I've sent #4383 with a test which shows the behavior change. I'm definitely interested in some expert opinions here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants