VS installer exited with code -1 flakily when building Windows binaries #1387

Open
huydhn opened this issue Apr 19, 2023 · 5 comments

@huydhn
Contributor

huydhn commented Apr 19, 2023

I'm currently seeing quite a number of flaky failures when building Windows binaries in trunk, for example https://github.com/pytorch/pytorch/actions/runs/4744014597/jobs/8424319388

The error points to this step https://github.com/pytorch/builder/blob/main/windows/internal/vs2022_install.ps1#L42 in which vs_installer.exe is installed. The exact error is "VS installer exited with code -1, which should be one of [0, 3010]". I have already tried disabling Windows Defender there (pytorch/pytorch#99389), but it doesn't seem to help.
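
For readers unfamiliar with the check behind that message, here is a minimal Python sketch of the logic. It is only an illustration, not the actual PowerShell in vs2022_install.ps1. The names ALLOWED_EXIT_CODES, installer_succeeded and describe_exit are hypothetical. The allowed codes come from the error message quoted above.

```python
# Illustration only: the real check lives in a PowerShell script.
# Exit codes treated as success, taken from the error message above.
# 3010 is the Windows "success, reboot required" code.
ALLOWED_EXIT_CODES = (0, 3010)


def installer_succeeded(exit_code, allowed=ALLOWED_EXIT_CODES):
    """Return True when the installer's exit code counts as success."""
    return exit_code in allowed


def describe_exit(exit_code, allowed=ALLOWED_EXIT_CODES):
    """Return "ok" on success, otherwise a message shaped like the one in the CI log."""
    if installer_succeeded(exit_code, allowed):
        return "ok"
    return (
        f"VS installer exited with code {exit_code}, "
        f"which should be one of {list(allowed)}"
    )
```

With this sketch, an exit code of -1 is reported as a failure, while 0 and 3010 both pass.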

Another minor bug occurs when vslogs.zip is copied at https://github.com/pytorch/builder/blob/main/windows/internal/vs2022_install.ps1#L54. The correct path should be C:\Users\${env:USERNAME}\AppData\Local\Temp\vslogs.zip, as the user is now runneruser instead of circleci. This copy failure hides the above error.
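
To illustrate the path fix, here is a small Python sketch that builds the log path from the user name instead of hardcoding circleci. It is an assumption-laden illustration rather than the script's real code. The function name vslogs_path is hypothetical, and ntpath is used only so that the Windows-style path is built the same way on any platform.

```python
import ntpath  # Windows path rules, usable on any OS


def vslogs_path(username):
    """Build the per-user location of vslogs.zip (hypothetical helper)."""
    return ntpath.join(
        "C:\\Users", username, "AppData", "Local", "Temp", "vslogs.zip"
    )


# On the runner this would be called with the current user,
# for example os.environ["USERNAME"], rather than a fixed name.
```

Passing runneruser yields the path where the logs are actually written, whereas the hardcoded circleci path does not exist on the current runners.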

cc @atalman @malfet @Blackhex

@huydhn huydhn added the bug label Apr 19, 2023
@malfet
Contributor

malfet commented Apr 19, 2023

VS2022 should be part of the AMI, shouldn't it?

@huydhn
Contributor Author

huydhn commented Apr 19, 2023

It looks like there is a gap here. The installation script used by the AMI https://github.com/pytorch/test-infra/blob/main/aws/ami/windows/scripts/Installers/Install-VS.ps1#L34 looks older and still uses VS2019. Thus, it makes sense that VS2022 is installed every time.

@Blackhex
Contributor

Blackhex commented Apr 19, 2023

Note that there is a pending PR, pytorch/test-infra#1175, that should update VS on the AMI. I haven't touched it for a while, but I can revive it if needed.

@Blackhex
Contributor

Also note that there might be a bug in collecting the VS logs that would be helpful for reporting the issue:

The workflow compresses the logs into the C:\Users\runneruser\AppData\Local\Temp\vslogs.zip file, but then the copy command fails with:

Copy-Item : Cannot find path 'C:\Users\circleci\AppData\Local\Temp\vslogs.zip' because it does not exist. 

@huydhn
Contributor Author

huydhn commented Apr 19, 2023

To summarize my chat with @malfet on the issue:

  1. Does this issue only happen with VS2022? If yes, could we roll back to VS2019 for the time being, as it matches what is currently in the AMI?
  2. Eventually we can use VS2022, but it would need to be part of the AMI (Update VS 2019 and add VS 2022 to Windows AMI test-infra#1175). cc @atalman I remember that you are testing a new Windows AMI. Is it possible to include this change too?

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this issue Apr 20, 2023
…90855) (#99591)

This reverts commit a88c15a.  Once we have the AMI ready, we can revert this and use VS2022 again.  This is to mitigate flaky Windows build in trunk pytorch/builder#1387.

Note that as VS2019 is already available in the current AMI, it won't be installed again per logic in https://github.com/pytorch/builder/blob/main/windows/internal/vs2019_install.ps1#L25-L29. Thus, this helps avoid the flaky installation issue.
Pull Request resolved: #99591
Approved by: https://github.com/kit1980, https://github.com/Blackhex, https://github.com/malfet