NativeAOT-LLVM compilation fails with STATUS_ACCESS_VIOLATION #2522

Open
knutwannheden opened this issue Feb 27, 2024 · 4 comments
Labels
area-NativeAOT-LLVM LLVM generation for Native AOT compilation (including Web Assembly)

Comments

@knutwannheden

Description

I have a simple wasi-wasm application that I compile to wasm using the NativeAOT-LLVM tooling. On my machine the compilation very often fails with the following error:

  MyTest failed with errors (47.2s) → bin\Release\net9.0\wasi-wasm\MyTest.dll
    C:\Users\Knut Wannheden\.nuget\packages\microsoft.dotnet.ilcompiler.llvm\9.0.0-alpha.1.24125.1\build\Microsoft.NETCore.Native.targets(348,5): error MSB3073: The command ""C:\Users\Knut Wannheden\.nuget\packages\runtime.win-x64.microsoft.dotnet.ilcompiler.llvm\9.0.0-alpha.1.24125.1\tools\\ilc" @"obj\Release\net9.0\wasi-wasm\native\MyTest.ilc.rsp"" exited with code -1073741819. [C:\Users\Knut Wannheden\MyTest\MyTest.csproj]

The exit code -1073741819 (0xC0000005 as an unsigned 32-bit value) is STATUS_ACCESS_VIOLATION.
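For anyone checking the mapping from the exit code to the status name: Windows reports the NTSTATUS as a signed 32-bit exit code, so masking it to 32 bits recovers the familiar hexadecimal form. A minimal sketch (the helper name `to_ntstatus` is made up for illustration):

```python
def to_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit process exit code as an unsigned NTSTATUS value."""
    return exit_code & 0xFFFFFFFF


status = to_ntstatus(-1073741819)
print(f"0x{status:08X}")  # 0xC0000005, the value of STATUS_ACCESS_VIOLATION
```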

If I retry the build a few times it will eventually succeed.

Reproduction Steps

Here is an attached zip of my project: MyTest.zip

Note that the MyTest.csproj file contains an absolute path that will have to be changed.

To build, I run dotnet publish -r wasi-wasm --self-contained.

Expected behavior

The build should either always fail (preferably with some helpful message) or always succeed.

Actual behavior

The build often fails as described above.

Regression?

No response

Known Workarounds

Retry a few times.
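Until the underlying crash is fixed, the retry can be scripted. The following is only a sketch of that workaround, not a fix: the function name `publish_with_retry` and its parameters are hypothetical, and it simply reruns the command until the process exits with 0. The `run` parameter exists only so that the loop can be exercised without invoking dotnet.

```python
import subprocess
import time


def publish_with_retry(cmd, max_attempts=5, delay_seconds=0.0, run=subprocess.run):
    """Run cmd until it exits with 0 or max_attempts is reached.

    Returns the number of the attempt that succeeded and raises
    RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(cmd)
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt} failed with exit code {result.returncode}")
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(f"build still failing after {max_attempts} attempts")


# Example call (not executed here):
# publish_with_retry(["dotnet", "publish", "-r", "wasi-wasm", "--self-contained"])
```

Note that a loop like this hides how often the crash happens, so it is worth keeping the printed attempt count in the build log.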

Configuration

  • .NET SDK: 9.0.100-preview.1.24101.2
  • Windows 11 Pro
  • Architecture: ARM64 (Windows running in an emulator on macOS on Apple silicon)
  • No idea if the problem is specific to this configuration or not

Other information

No response

@MichalStrehovsky transferred this issue from dotnet/runtime Feb 27, 2024
@MichalStrehovsky added the area-NativeAOT-LLVM LLVM generation for Native AOT compilation (including Web Assembly) label Feb 27, 2024
@knutwannheden
Author

Here is a corresponding dump file: ilc.exe.6592.dmp

@SingleAccretion

The dump (along with a fuller one shared on Discord) shows that the failure is an AV on a NULL Thread inside what I am assuming is ActivationHandler (the register context of the crashing thread in both dumps is missing), happening during suspension. Some details from the full dump:

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 00007ff7376ba90d (ilc!PalInterlockedAnd)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000000000000038
Attempt to read from address 0000000000000038

FAULTING_THREAD:  ffffffff

STACK_TEXT:  
00007ff7`376ba90d 00007ff7`376ba90d ilc!Thread::SetActivationPending+0xd

FAULTING_SOURCE_FILE:  D:\a\_work\1\s\src\coreclr\nativeaot\Runtime\thread.cpp
FAULTING_SOURCE_LINE_NUMBER:  1112

SYMBOL_NAME:  ilc!Thread::SetActivationPending+d
MODULE_NAME: ilc
IMAGE_NAME:  ilc.exe

OS_VERSION:  10.0.22621.1
OSPLATFORM_TYPE:  arm64

The context of execution here is x64 emulation on Windows ARM64 on macOS ARM64 via https://getutm.app/, so it seems likely to be a bug in one of those components. Curiously, the documentation for QueueUserAPC2 mentions that special user-mode APCs are only supported on native architectures, but you would expect it to fail in a way that the suspension code detects properly.
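For readers less familiar with the EXCEPTION_RECORD fields quoted above: for an access violation, Parameter[0] encodes the kind of access (0 for a read, 1 for a write, 8 for an execute/DEP fault) and Parameter[1] is the address that was touched. A minimal sketch of that decoding follows; the helper name is made up for illustration. That 0x38 is the offset of a field read through a NULL Thread pointer is the inference drawn in the comment above, not something this snippet proves.

```python
# Meaning of the first parameter of an access-violation exception record.
ACCESS_KINDS = {0: "read", 1: "write", 8: "execute"}


def describe_access_violation(parameters):
    """Return (access kind, faulting address) from the two exception parameters."""
    kind = ACCESS_KINDS.get(parameters[0], "unknown")
    address = parameters[1]
    return kind, address


kind, address = describe_access_violation([0x0, 0x38])
print(f"{kind} at 0x{address:X}")  # read at 0x38
```

An address this close to zero is what a field access through a null base pointer typically looks like, which is consistent with the "Attempt to read from address 0000000000000038" line in the dump.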

@filipnavara
Member

Firstly, the test project can be modified not to include the absolute path with these two changes:

  • <IlcHostPackagePath>$(Pkgruntime_win-x64_Microsoft_DotNet_ILCompiler_LLVM)</IlcHostPackagePath>
  • <PackageReference Include="runtime.win-x64.Microsoft.DotNet.ILCompiler.LLVM" Version="9.0.0-*" GeneratePathProperty="true" />

I can reproduce the crash on a Windows Dev Kit machine running Windows ARM64 22631.3155. I got both full dumps and minidumps, and they are equally "corrupted" when viewed in WinDbg. It seems that WinDbg just doesn't handle the x64 emulation correctly.

g_pfnQueueUserAPC2Proc is not NULL, which suggests we are hitting dotnet/runtime#99033.

@filipnavara
Member

After some more debugging (and hitting debugger bugs), it turns out that QueueUserAPC2 does work under x64 emulation, at least to some extent. It queues the APCs, they do get delivered, and at least in certain cases they also get correct pointers.
