Assertion failure 0 <= fd && fd < sysconf(_SC_OPEN_MAX) in System.Net.Mail.Functional.Tests #72830

Closed
noahfalk opened this issue Jul 26, 2022 · 35 comments · Fixed by #76361
Labels: area-System.Net.Sockets, Known Build Error, os-linux, os-mac-os-x

Comments

@noahfalk
Member

noahfalk commented Jul 26, 2022

Description

System.Net.Mail.Functional.Tests are failing with this assert in CI:

https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-72664-merge-3f079befb6de4fac81/System.Net.Mail.Functional.Tests/1/console.08ced1c9.log?helixlogtype=result

----- start Fri 22 Jul 2022 12:11:04 PM UTC =============== To repro directly: =====================================================
pushd .
/root/helix/work/correlation/dotnet exec --runtimeconfig System.Net.Mail.Functional.Tests.runtimeconfig.json --depsfile System.Net.Mail.Functional.Tests.deps.json xunit.console.dll System.Net.Mail.Functional.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing 
popd
===========================================================================================================
/root/helix/work/workitem/e /root/helix/work/workitem/e
  Discovering: System.Net.Mail.Functional.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Net.Mail.Functional.Tests (found 154 of 156 test cases)
  Starting:    System.Net.Mail.Functional.Tests (parallel test collections = on, max threads = 2)
    System.Net.Mail.Tests.SmtpClientTest.TestGssapiAuthentication [SKIP]
      Condition(s) not met: "IsNtlmInstalled"
dotnet: /__w/1/s/src/native/libs/Common/pal_utilities.h:86: int ToFileDescriptor(intptr_t): Assertion `0 <= fd && fd < sysconf(_SC_OPEN_MAX)' failed.

Reproduction Steps

Example CI build: https://dev.azure.com/dnceng/public/_build/results?buildId=1897299&view=ms.vss-test-web.build-test-results-tab

Expected behavior

Test doesn't fail in CI

Actual behavior

Test does fail in CI, see description.

Regression?

Unknown

Known Workarounds

Unknown

Configuration

Linux Debug x64 Mono Interpreter

Other information

No response

{ "ErrorMessage":"0 <= fd && fd < sysconf(_SC_OPEN_MAX)" } 

Report

Build  Definition      Test                                                 Pull Request
47765  dotnet/runtime  System.Net.Mail.Functional.Tests.WorkItemExecution  #76871
37711  dotnet/runtime  System.Net.Mail.Functional.Tests.WorkItemExecution
36085  dotnet/runtime  System.Net.Mail.Functional.Tests.WorkItemExecution
33387  dotnet/runtime  System.Net.Mail.Functional.Tests.WorkItemExecution

Summary

24-Hour Hit Count  7-Day Hit Count  1-Month Count
0                  0                4
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 26, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 26, 2022
@ghost

ghost commented Jul 26, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Author: noahfalk
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo
Member

EgorBo commented Jul 26, 2022

Judging by the stacktrace and the job itself it's mono-interp

@EgorBo EgorBo added area-Codegen-Interpreter-mono and removed area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Jul 26, 2022
@ghost

ghost commented Jul 26, 2022

Tagging subscribers to this area: @BrzVlad
See info in area-owners.md if you want to be subscribed.

Author: noahfalk
Assignees: -
Labels:

untriaged, area-Codegen-Interpreter-mono

Milestone: -

@noahfalk noahfalk added the blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' label Jul 26, 2022
@danmoseley
Member

I pasted stacks here. It appears that the mail code, or underlying networking code, is attempting to use a file descriptor of -1, which I assume is invalid.
#72818 (comment)
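
For context, the failing check can be sketched roughly as below. This is an illustration based only on the assertion text in the log, not the actual contents of pal_utilities.h; the helper names `fd_in_valid_range` and `to_file_descriptor_checked` are hypothetical, and treating a `sysconf` result of -1 as "no upper limit" is an added assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper (not from the runtime sources): reports whether `fd`
   satisfies the condition quoted in the assertion message. */
static bool fd_in_valid_range(intptr_t fd)
{
    long open_max = sysconf(_SC_OPEN_MAX);

    if (fd < 0)
        return false;            /* -1 always fails the lower bound */
    if (open_max < 0)
        return true;             /* assumption: -1 means no determinate limit */
    return fd < open_max;
}

/* Hypothetical checked narrowing from a pointer-sized handle to an int fd.
   In a debug build the assert aborts the process (SIGABRT), which matches
   the crash in the log; with NDEBUG defined the check disappears. */
static int to_file_descriptor_checked(intptr_t fd)
{
    assert(fd_in_valid_range(fd));
    return (int)fd;
}
```

With these definitions, `fd_in_valid_range(-1)` is false, so passing -1 to `to_file_descriptor_checked` in a debug build aborts, while a small non-negative value such as 0 passes.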

@danmoseley danmoseley changed the title from "Assertion failure 0 <= fd && fd < sysconf(_SC_OPEN_MAX)" to "Assertion failure 0 <= fd && fd < sysconf(_SC_OPEN_MAX) in System.Net.Mail.Functional.Tests" Jul 26, 2022
@ghost

ghost commented Jul 26, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Author: noahfalk
Assignees: -
Labels:

area-System.Net, blocking-clean-ci, untriaged

Milestone: -

@wfurt
Member

wfurt commented Jul 26, 2022

cc: @tmds

Seems like we do closing magic on an invalid socket...
(the stack below is copied from @danmoseley's post)

top of stack looks like

#10 0x00007f4c7b4ab5cd in ToFileDescriptor (fd=-1) at /__w/1/s/src/native/libs/Common/pal_utilities.h:86
#11 0x00007f4c7b4abf95 in SystemNative_FcntlGetFD (fd=-1) at /__w/1/s/src/native/libs/System.Native/pal_io.c:611
...
	  at <unknown> <0xffffffff>
	  at Fcntl:<GetFD>g____PInvoke|5_0 <0x00020>
	  at Fcntl:GetFD <0x00020>
	  at System.Net.Sockets.SafeSocketHandle:TryUnblockSocket <0x0003a>
	  at System.Net.Sockets.SafeSocketHandle:CloseAsIs <0x000f4>
	  at System.Net.Sockets.Socket:Dispose <0x00426>
	  at System.Net.Sockets.Socket:Dispose <0x000a4>
	  at System.Net.Sockets.Socket:Close <0x00098>
	  at System.Net.Sockets.TcpClient:Dispose <0x0011c>
	  at System.Net.Sockets.TcpClient:Dispose <0x0001a>
	  at System.Net.Mail.SmtpConnection:ShutdownConnection <0x00184>
	  at System.Net.Mail.SmtpConnection:Abort <0x00012>
	  at System.Net.Mail.SmtpTransport:Abort <0x00078>
	  at System.Net.Mail.SmtpClient:Abort <0x0001c>
	  at System.Net.Mail.SmtpClient:SendAsyncCancel <0x00088>
	  at <>c:<SendMailAsync>b__84_1 <0x0001c>
	  at System.Threading.CancellationTokenSource:Invoke <0x00042>
or

#9  0x00007fbc510f8102 in __GI___assert_fail (assertion=0x7fbc4dcdc8a6 "0 <= fd && fd < sysconf(_SC_OPEN_MAX)", file=0x7fbc4dcddbb1 "/__w/1/s/src/native/libs/Common/pal_utilities.h", line=86, function=0x7fbc4dcdd527 "int ToFileDescriptor(intptr_t)") at assert.c:101
#10 0x00007fbc4dceb7cd in ToFileDescriptor (fd=-1) at /__w/1/s/src/native/libs/Common/pal_utilities.h:86
#11 0x00007fbc4dcebd8e in SystemNative_SetLingerOption (socket=-1, option=0x7fbc45d84d18) at /__w/1/s/src/native/libs/System.Native/pal_networking.c:1278
...
	  at <unknown> <0xffffffff>
	  at Sys:<SetLingerOption>g____PInvoke|34_0 <0x00024>
	  at Sys:SetLingerOption <0x00068>
	  at System.Net.Sockets.SocketPal:SetLingerOption <0x00092>
	  at System.Net.Sockets.Socket:SetLingerOption <0x00022>
	  at System.Net.Sockets.Socket:SetSocketOption <0x00248>
	  at System.Net.Sockets.Socket:set_LingerState <0x00024>
	  at System.Net.Sockets.TcpClient:set_LingerState <0x00022>
	  at System.Net.Mail.SmtpConnection:ShutdownConnection <0x000f0>
	  at System.Net.Mail.SmtpConnection:Abort <0x00012>
	  at System.Net.Mail.SmtpTransport:Abort <0x00078>
	  at System.Net.Mail.SmtpClient:Abort <0x0001c>
	  at System.Net.Mail.SmtpClient:SendAsyncCancel <0x00088>
	  at <>c:<SendMailAsync>b__84_1 <0x0001c>
	  at System.Threading.CancellationTokenSource:Invoke <0x00042>

@ghost

ghost commented Jul 26, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Author: noahfalk
Assignees: -
Labels:

area-System.Net.Sockets, blocking-clean-ci, untriaged

Milestone: -

@wfurt wfurt removed the untriaged New issue has not been triaged by the area owner label Jul 26, 2022
@wfurt wfurt added this to the 7.0.0 milestone Jul 26, 2022
@karelz
Member

karelz commented Aug 3, 2022

@rzikm can you please check how often it happens? Thanks!

@rzikm
Member

rzikm commented Aug 3, 2022

Very often, 98 hits in the last 14 days. Curiously, none of these are on main

@wfurt
Member

wfurt commented Aug 3, 2022

Aside from some authentication changes, #70046 would be the biggest suspect. It may not be the root cause, as the assert is in Sockets. I tried to reproduce it (main on Linux) but did not get a hit. We can probably look at some of the Linux/Windows core files to see which particular tests were running.

@karelz karelz added os-linux Linux OS (any supported distro) os-mac-os-x macOS aka OSX labels Aug 4, 2022
@karelz
Member

karelz commented Aug 7, 2022

If it is happening that often perhaps we should disable the test for now to avoid noise in CI ... @rzikm thoughts?

@karelz
Member

karelz commented Aug 30, 2022

Status: After re-enabling the test, we got some hits on PRs. @rzikm has an actionable dump link.

@wfurt
Member

wfurt commented Aug 30, 2022

this is from
https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-74639-merge-d6f8b011644f4edc84/System.Net.Mail.Functional.Tests/1/console.694b6d7a.log?helixlogtype=result

(lldb) clrstack -a
OS Thread Id: 0xfe8 (1)
        Child SP               IP Call Site
00007F1D4AAD3008 00007f5e86cf9387 [InlinedCallFrame: 00007f1d4aad3008] Interop+Sys.<SetLingerOption>g____PInvoke|34_0(IntPtr, LingerOption*)
00007F1D4AAD3008 00007f5e08a331ed [InlinedCallFrame: 00007f1d4aad3008] Interop+Sys.<SetLingerOption>g____PInvoke|34_0(IntPtr, LingerOption*)
00007F1D4AAD3000 00007F5E08A331ED ILStubClass.IL_STUB_PInvoke(IntPtr, LingerOption*)
    PARAMETERS:
        <no data>
        <no data>

00007F1D4AAD30D0 00007F5E08D61E09 Interop+Sys.SetLingerOption(System.Runtime.InteropServices.SafeHandle, LingerOption*) [/_/src/libraries/System.Net.Sockets/src/Microsoft.Interop.LibraryImportGenerator/Microsoft.Interop.LibraryImportGenerator/LibraryImports.g.cs @ 772]
    PARAMETERS:
        socket (0x00007F1D4AAD3118) = 0x00007f1df583ad88
        option (0x00007F1D4AAD3110) = 0x00007f1d4aad3188
    LOCALS:
        0x00007F1D4AAD3108 = 0xffffffffffffffff
        0x00007F1D4AAD3104 = 0xffffffff00007f5e
        0x00007F1D4AAD30F8 = 0x0000000000000001
        0x00007F1D4AAD30F4 = 0x0000000000000000
        0x00007F1D4AAD30F0 = 0x0000000000000000


00007F1D4AAD3130 00007F5E08D61CF0 System.Net.Sockets.SocketPal.SetLingerOption(System.Net.Sockets.SafeSocketHandle, System.Net.Sockets.LingerOption) [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketPal.Unix.cs @ 1534]
    PARAMETERS:
        handle (0x00007F1D4AAD3198) = 0x00007f1df583ad88
        optionValue (0x00007F1D4AAD3190) = 0x00007f1df584a790
    LOCALS:
        0x00007F1D4AAD3188 = 0x0000000000000001
        0x00007F1D4AAD3184 = 0x0000000100000000
        0x00007F1D4AAD3178 = 0x0000000000000001
        0x00007F1D4AAD3174 = 0x0000000100007f1d

When I dump the SafeHandle, it has a reasonable value...

=============================================================================
(lldb) dumpobj 0x7f1df583ad88
Name:        System.Net.Sockets.SafeSocketHandle
MethodTable: 00007f5e08a797f8
EEClass:     00007f5e08a82d00
Size:        80(0x50) bytes
File:        /mnt/work/B1FC09CD/p/shared/Microsoft.NETCore.App/8.0.0/System.Net.Sockets.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007f5e07bc9a80  4001127        8        System.IntPtr  1 instance 00000000000000A3 handle
00007f5e07b3f0f0  4001128       10         System.Int32  1 instance                8 _state
00007f5e07b3bbf0  4001129       14       System.Boolean  1 instance                1 _ownsHandle
00007f5e07b3bbf0  400112a       15       System.Boolean  1 instance                1 _fullyInitialized
00007f5e08a78c20  4000106       20         System.Int32  1 instance       -559038737 _closeSocketResult
00007f5e08a78c20  4000107       24         System.Int32  1 instance       -559038737 _closeSocketLinger
00007f5e07b3f0f0  4000108       28         System.Int32  1 instance                0 _closeSocketThread
00007f5e07b3f0f0  4000109       2c         System.Int32  1 instance                0 _closeSocketTick
00007f5e07b3f0f0  400010a       30         System.Int32  1 instance                0 _ownClose
00007f5e07b3bbf0  400010b       3c       System.Boolean  1 instance                1 <OwnsHandle>k__BackingField
00007f5e07b3bbf0  400010c       3d       System.Boolean  1 instance                0 _released
00007f5e07b3bbf0  400010d       3e       System.Boolean  1 instance                0 _hasShutdownSend
00007f5e07b3f0f0  400010e       34         System.Int32  1 instance               -1 _receiveTimeout
00007f5e07b3f0f0  400010f       38         System.Int32  1 instance               -1 _sendTimeout
00007f5e07b3bbf0  4000110       3f       System.Boolean  1 instance                0 _nonBlocking
00007f5e08aae4b0  4000111       18 ...ocketAsyncContext  0 instance 00007f1df583aef8 _asyncContext
00007f5e08a796a0  4000112       16         System.Int16  1 instance                2 _trackedOptions
00007f5e07b3bbf0  4000113       40       System.Boolean  1 instance                0 <LastConnectFailed>k__BackingField
00007f5e07b3bbf0  4000114       41       System.Boolean  1 instance                1 <DualMode>k__BackingField
00007f5e07b3bbf0  4000115       42       System.Boolean  1 instance                0 <ExposedHandleOrUntrackedConfiguration>k__BackingField
00007f5e07b3bbf0  4000116       43       System.Boolean  1 instance                0 <PreferInlineCompletions>k__BackingField
00007f5e07b3bbf0  4000117       44       System.Boolean  1 instance                1 <IsSocket>k__BackingField
00007f5e07b3bbf0  4000118       45       System.Boolean  1 instance                0 <IsDisconnected>k__BackingField

Any idea how that becomes invalid, @jkotas?
I could not confirm the -1 above, but the first thread looks suspicious.

(lldb) thread select 1
* thread #1, name = 'dotnet', stop reason = signal SIGABRT
    frame #0: 0x00007f5e86cf9387
->  0x7f5e86cf9387: cmpq   $-0x1000, %rax            ; imm = 0xF000
    0x7f5e86cf938d: ja     0x7f5e86cf93ad
    0x7f5e86cf938f: rep    retq
    0x7f5e86cf9391: nopl   (%rax)
(lldb) bt
* thread #1, name = 'dotnet', stop reason = signal SIGABRT
  * frame #0: 0x00007f5e86cf9387

Since this was last reported on macOS, it seems unlikely to be related to #73972.

@jkotas jkotas added the Known Build Error Use this to report build issues in the .NET Helix tab label Aug 30, 2022
@jkotas
Member

jkotas commented Aug 30, 2022

@AaronRobinsonMSFT Could you please take a look?

The SafeHandle.handle is a good value (00000000000000A3), but it somehow turned into -1 by the time it got into the P/Invoke.

@AaronRobinsonMSFT
Member

> since this was last reported on macOS

I can't seem to reproduce this on an M1. I will try Linux-x64 next.

@wfurt
Member

wfurt commented Aug 31, 2022

@rzikm and I spent days trying to reproduce it without any luck, @AaronRobinsonMSFT. We really only have some dumps from CI.

@liveans liveans assigned liveans and unassigned wfurt Sep 13, 2022
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Sep 29, 2022
@liveans
Member

liveans commented Oct 16, 2022

SafeSocketHandle oldHandle = _handle;
SocketError errorCode = SocketPal.CreateSocket(_addressFamily, _socketType, _protocolType, out _handle);

I think this is the root cause of the issue, because it's the only place outside of the Socket constructor where the handle can temporarily be set to -1 (via CreateSocket). In this case, we're trying to close the socket (or set the linger option) while the handle is being replaced.

SafeSocketHandle oldHandle = _handle;
SafeSocketHandle newHandle;
SocketError errorCode = SocketPal.CreateSocket(_addressFamily, _socketType, _protocolType, out newHandle);
_handle = newHandle;

Something like this should fix it.

@danmoseley
Member

I can't quite see how that fixes it. It's an out parameter, so how is what you wrote different to the existing code?

@liveans
Member

liveans commented Oct 16, 2022

The problem is actually a race condition. At the beginning of the CreateSocket function we have something like this:

This means we're replacing the current Socket instance's handle with a default-constructed SafeSocketHandle. At that point there is a chance to observe -1 as the handle's file descriptor if the code is running in a multithreaded/async environment.

In the same function we have another line that updates the file descriptor:

Edit: I deleted the wrong information
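
The window described above can be simulated deterministically. The following is a single-threaded C sketch, not the actual System.Net.Sockets code: every name in it (`FakeHandle`, `FakeSocket`, `create_handle`, `replace_handle_racy`, `replace_handle_local`, `record_fd`, `fd_seen_during_replace`) is hypothetical, and the observer callback stands in for a second thread that reads the socket's handle in the middle of creation. It assumes, as the comment describes, that a newly constructed handle holds -1 until the descriptor is filled in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a safe handle and the socket that owns it. */
typedef struct FakeHandle { intptr_t fd; } FakeHandle;
typedef struct FakeSocket { FakeHandle *handle; } FakeSocket;

/* Stands in for "another thread" running in the middle of creation. */
typedef void (*Observer)(void *state);

/* Publishes a default handle (fd == -1) through `out` first and fills in
   the real descriptor only afterwards, mirroring the described order. */
static int create_handle(FakeHandle **out, intptr_t new_fd,
                         Observer observe, void *state)
{
    FakeHandle *h = malloc(sizeof *h);
    if (h == NULL)
        return -1;
    h->fd = -1;          /* default-constructed: no descriptor yet */
    *out = h;            /* visible to whoever owns `out` from here on */
    if (observe != NULL)
        observe(state);  /* the simulated second thread looks now */
    h->fd = new_fd;      /* descriptor is filled in later */
    return 0;
}

/* Racy variant: `out` aliases the shared field, so the half-initialized
   handle is visible on the socket during creation. */
static int replace_handle_racy(FakeSocket *s, intptr_t new_fd,
                               Observer observe, void *state)
{
    assert(s != NULL);
    FakeHandle *old = s->handle;
    int rc = create_handle(&s->handle, new_fd, observe, state);
    if (rc == 0)
        free(old);
    return rc;
}

/* Proposed shape: create into a local, assign the field once it is ready. */
static int replace_handle_local(FakeSocket *s, intptr_t new_fd,
                                Observer observe, void *state)
{
    assert(s != NULL);
    FakeHandle *old = s->handle;
    FakeHandle *fresh = NULL;
    int rc = create_handle(&fresh, new_fd, observe, state);
    if (rc == 0) {
        s->handle = fresh;
        free(old);
    }
    return rc;
}

typedef struct Probe { const FakeSocket *sock; intptr_t seen_fd; } Probe;

/* Observer that records the descriptor currently visible on the socket. */
static void record_fd(void *state)
{
    Probe *p = state;
    p->seen_fd = p->sock->handle->fd;
}

/* Runs one replacement and returns what the observer saw mid-creation. */
static intptr_t fd_seen_during_replace(bool use_local)
{
    FakeHandle *initial = malloc(sizeof *initial);
    if (initial == NULL)
        abort();
    initial->fd = 0xA3;

    FakeSocket sock = { initial };
    Probe probe = { &sock, 0 };
    int rc = use_local ? replace_handle_local(&sock, 0xA4, record_fd, &probe)
                       : replace_handle_racy(&sock, 0xA4, record_fd, &probe);
    if (rc != 0)
        abort();
    free(sock.handle);
    return probe.seen_fd;
}
```

In this simulation the observer sees -1 with `replace_handle_racy` and the old descriptor (0xA3) with `replace_handle_local`. A real fix would still need to consider how the old handle is disposed while other threads may hold it, which this sketch ignores.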

@danmoseley
Member

Ah, I didn't realize it's multithreaded.

@wfurt
Member

wfurt commented Oct 18, 2022

Where is the code in SendAsync, @liveans? I only saw some replacement during connect. I think this can still be a problem if cancellation and cleanup kick in on the thread pool. It would probably be worth trying the proposed change.

@liveans
Member

liveans commented Oct 18, 2022

> Where is the code in SendAsync, @liveans? I only saw some replacement during connect. I think this can still be a problem if cancellation and cleanup kick in on the thread pool. It would probably be worth trying the proposed change.

Yesterday evening I discussed it with @antonfirsov as well. Afterwards we noticed that I had mistraced the path: ReplaceHandle is not used in the SendAsync path (my bad); it's used via the SendPacketsAsync path. I still think the proposed change can fix this issue, because this is the only place in the whole code where we change the handle without using the constructor.

@tmds
Member

tmds commented Oct 18, 2022

> I mistraced the path: ReplaceHandle is not used in the SendAsync path

I haven't looked at the code, but I think there probably is such a path.
SmtpClient SendAsync establishes the connection, and connect calls ReplaceHandle on Unix to try multiple IP addresses.

The fix in #70046 (previously mentioned by @wfurt) was about making SendAsync keep using an open connection, see #49340 (comment). So that may have triggered the issue.

It's definitely possible we're seeing a race between connect replacing the handle and SmtpClient Abort observing this half-initialized handle.

@liveans
Member

liveans commented Oct 18, 2022

> I mistraced the path: ReplaceHandle is not used in the SendAsync path
>
> I haven't looked at the code, but I think there probably is such a path.

We should double-check it then, thanks for correcting me!

> It's definitely possible we're seeing a race between connect replacing the handle and SmtpClient Abort observing this half-initialized handle.

The proposed fix is worth trying, then.

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Oct 24, 2022
@karelz karelz modified the milestones: 7.0.0, 8.0.0 Oct 25, 2022
@karelz
Member

karelz commented Nov 3, 2022

So far it has not been reported by external customers. The reports came in only from our CI.
It is a race condition where we close a socket in parallel with its creation (while it is being created in a batch of socket handles via the array overload) -- a rare stress scenario with a very small time window in which it can happen.
Impact on customers (on release builds without asserts) -- a memory leak of 1 handle.

Not worth servicing 7.0.x, until we get reports from customers.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 3, 2022
@carlossanlop
Member

@karelz @wfurt FYI this failure happened again today in 7.0. Based on the last comment, I won't reopen the issue, but I am pasting all the information here so this gets linked with the affected PR, and to preserve history.

===========================================================================================================
/root/helix/work/workitem/e /root/helix/work/workitem/e
  Discovering: System.Net.Mail.Functional.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Net.Mail.Functional.Tests (found 153 of 156 test cases)
  Starting:    System.Net.Mail.Functional.Tests (parallel test collections = on, max threads = 2)
    System.Net.Mail.Tests.SmtpClientTest.TestGssapiAuthentication [SKIP]
      Condition(s) not met: "IsNtlmInstalled"

=================================================================
	Native Crash Reporting
=================================================================
Got a SIGABRT while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================
dotnet: /__w/1/s/src/native/libs/Common/pal_utilities.h:86: int ToFileDescriptor(intptr_t): Assertion `0 <= fd && fd < sysconf(_SC_OPEN_MAX)' failed.

=================================================================
	Native stacktrace:
=================================================================
	0x7f68d31e0e92 - Unknown
	0x7f68d318759e - Unknown
	0x7f68d31e0768 - Unknown
	0x7f68d40a6630 - Unknown
	0x7f68d32db387 - Unknown
	0x7f68d32dca78 - Unknown
	0x7f68d32d41a6 - Unknown
	0x7f68d32d4252 - Unknown
	0x7f68d02805ed - Unknown
	0x7f68d0280fb5 - Unknown
	0x411053fb - Unknown

=================================================================
	External Debugger Dump:
=================================================================
Missing separate debuginfo for /root/helix/work/correlation/dotnet
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/22/ad60fa877b47f26b08c103e8e928402687212b.debug
[New LWP 62]
[New LWP 58]
[New LWP 53]
[New LWP 51]
[New LWP 42]
[New LWP 41]
[New LWP 40]
[New LWP 39]
[New LWP 38]
[New LWP 37]
[New LWP 32]
[New LWP 31]
[New LWP 28]
[New LWP 27]
[New LWP 26]
[New LWP 25]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Missing separate debuginfo for /root/helix/work/correlation/host/fxr/7.0.11/libhostfxr.so
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/19/1d9de57b7b94b2c45d49368284fd6d1b814753.debug
Missing separate debuginfo for /root/helix/work/correlation/shared/Microsoft.NETCore.App/7.0.11/libhostpolicy.so
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/0e/81d56a9d12fffce10fbfa3704f140d23309670.debug
0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  Id   Target Id         Frame 
  17   Thread 0x7f68d23ff700 (LWP 25) "SGen worker" 0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  16   Thread 0x7f68d0837700 (LWP 26) "dotnet" 0x00007f68d3398ddd in poll () from /lib64/libc.so.6
  15   Thread 0x7f68d0636700 (LWP 27) "Finalizer" 0x00007f68d40a4b3b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0
  14   Thread 0x7f68c98f6700 (LWP 28) "dotnet" 0x00007f68d40a575d in read () from /lib64/libpthread.so.0
  13   Thread 0x7f68bfdfc700 (LWP 31) ".NET Long Runni" 0x00007f68d339de29 in syscall () from /lib64/libc.so.6
  12   Thread 0x7f68bf5fb700 (LWP 32) ".NET Long Runni" 0x00007f68d339de29 in syscall () from /lib64/libc.so.6
  11   Thread 0x7f68c985c700 (LWP 37) ".NET Long Runni" 0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  10   Thread 0x7f68c965b700 (LWP 38) ".NET ThreadPool" 0x00007f68d40a2de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  9    Thread 0x7f68c8073700 (LWP 39) ".NET ThreadPool" 0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  8    Thread 0x7f68be5f9700 (LWP 40) ".NET ThreadPool" 0x00007f68d40a61d9 in waitpid () from /lib64/libpthread.so.0
  7    Thread 0x7f68bedfa700 (LWP 41) ".NET Long Runni" 0x00007f68d40a2de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6    Thread 0x7f68bc84b700 (LWP 42) ".NET Long Runni" 0x00007f68d40a2de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5    Thread 0x7f68bc3ff700 (LWP 51) ".NET Sockets" 0x00007f68d33a40e3 in epoll_wait () from /lib64/libc.so.6
  4    Thread 0x7f689bfff700 (LWP 53) ".NET Timer" 0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3    Thread 0x7f689bbeb700 (LWP 58) ".NET ThreadPool" 0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2    Thread 0x7f689b9ea700 (LWP 62) ".NET ThreadPool" 0x00007f68d40a2de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1    Thread 0x7f68d44c7780 (LWP 24) "dotnet" 0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

Thread 17 (Thread 0x7f68d23ff700 (LWP 25)):
#0  0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d30ce493 in mono_os_cond_wait (cond=<optimized out>, mutex=<optimized out>) at /__w/1/s/src/mono/mono/mini/../../mono/utils/mono-os-mutex.h:219
#2  get_work (worker_index=<optimized out>, work_context=<optimized out>, do_idle=<optimized out>, job=<optimized out>) at /__w/1/s/src/mono/mono/sgen/sgen-thread-pool.c:167
#3  thread_func (data=0x0) at /__w/1/s/src/mono/mono/sgen/sgen-thread-pool.c:198
#4  0x00007f68d409eea5 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f68d33a3b0d in clone () from /lib64/libc.so.6

Thread 16 (Thread 0x7f68d0837700 (LWP 26)):
#0  0x00007f68d3398ddd in poll () from /lib64/libc.so.6
#1  0x00007f68d327a41a in ipc_poll_fds (fds=<optimized out>, nfds=1, timeout=4294967295) at /__w/1/s/src/native/eventpipe/ds-ipc-pal-socket.c:470
#2  ds_ipc_poll (poll_handles_data=0x7f68cc002250, poll_handles_data_len=1, timeout_ms=4294967295, callback=0x7f68d3279850 <server_warning_callback>) at /__w/1/s/src/native/eventpipe/ds-ipc-pal-socket.c:1097
#3  0x00007f68d3277925 in ds_ipc_stream_factory_get_next_available_stream (callback=0x7f68d3279850 <server_warning_callback>) at /__w/1/s/src/native/eventpipe/ds-ipc.c:395
#4  0x00007f68d3276029 in server_thread (data=<optimized out>) at /__w/1/s/src/native/eventpipe/ds-server.c:127
#5  0x00007f68d3279831 in ep_rt_thread_mono_start_func (data=0x555fcfe68e30) at /__w/1/s/src/mono/mono/mini/../eventpipe/ep-rt-mono.h:1356
#6  0x00007f68d409eea5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f68d33a3b0d in clone () from /lib64/libc.so.6

Thread 15 (Thread 0x7f68d0636700 (LWP 27)):
#0  0x00007f68d40a4b3b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0
#1  0x00007f68d40a4bcf in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00007f68d40a4c6b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
#3  0x00007f68d304f366 in mono_os_sem_wait (sem=<optimized out>, flags=MONO_SEM_FLAGS_ALERTABLE) at /__w/1/s/src/mono/mono/mini/../utils/mono-os-semaphore.h:204
#4  mono_coop_sem_wait (sem=<optimized out>, flags=MONO_SEM_FLAGS_ALERTABLE) at /__w/1/s/src/mono/mono/mini/../../mono/utils/mono-coop-semaphore.h:41
#5  finalizer_thread (unused=<optimized out>) at /__w/1/s/src/mono/mono/metadata/gc.c:891
#6  0x00007f68d30282da in start_wrapper_internal (start_info=0x0, stack_ptr=<optimized out>) at /__w/1/s/src/mono/mono/metadata/threads.c:1202
#7  0x00007f68d3028169 in start_wrapper (data=0x555fcfe7a7f0) at /__w/1/s/src/mono/mono/metadata/threads.c:1264
#8  0x00007f68d409eea5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f68d33a3b0d in clone () from /lib64/libc.so.6

Thread 14 (Thread 0x7f68c98f6700 (LWP 28)):
#0  0x00007f68d40a575d in read () from /lib64/libpthread.so.0
#1  0x00007f68d028e70e in SignalHandlerLoop (arg=0x555fd06b13b0) at /__w/1/s/src/native/libs/System.Native/pal_signal.c:323
#2  0x00007f68d409eea5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f68d33a3b0d in clone () from /lib64/libc.so.6

Thread 13 (Thread 0x7f68bfdfc700 (LWP 31)):
#0  0x00007f68d339de29 in syscall () from /lib64/libc.so.6
#1  0x00007f68c87289ce in ust_listener_thread () from /lib64/liblttng-ust.so.0
#2  0x00007f68d409eea5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f68d33a3b0d in clone () from /lib64/libc.so.6

Thread 12 (Thread 0x7f68bf5fb700 (LWP 32)):
#0  0x00007f68d339de29 in syscall () from /lib64/libc.so.6
#1  0x00007f68c87289ce in ust_listener_thread () from /lib64/liblttng-ust.so.0
#2  0x00007f68d409eea5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f68d33a3b0d in clone () from /lib64/libc.so.6

Thread 11 (Thread 0x7f68c985c700 (LWP 37)):
#0  0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d028f2fc in SystemNative_LowLevelMonitor_Wait (monitor=0x7f68c01b2780) at /__w/1/s/src/native/libs/System.Native/pal_threading.c:155
#2  0x0000000040f3791c in ?? ()
#3  0x00007f68d25c1f60 in ?? ()
#4  0xffffffffffffffff in ?? ()
#5  0x00007f68bc990e30 in ?? ()
#6  0x0000000000000001 in ?? ()
#7  0x00007f68d25c1f90 in ?? ()
#8  0xffffffffffffffff in ?? ()
#9  0x00007f68bc990e30 in ?? ()
#10 0x00007f68c0169fa0 in ?? ()
#11 0x00007f68c985b8b0 in ?? ()
#12 0x00007f68c985b7c0 in ?? ()
#13 0x00007f68c00008c0 in ?? ()
#14 0x0000000040f3785c in ?? ()
#15 0x00007f68c985b930 in ?? ()
#16 0x0000000040f3781c in ?? ()
#17 0x00007f68d25c1f90 in ?? ()
#18 0x0000000040f36ea8 in ?? ()
#19 0x00007f68c985b930 in ?? ()
#20 0x0000000000000000 in ?? ()

Thread 10 (Thread 0x7f68c965b700 (LWP 38)):
#0  0x00007f68d40a2de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d028f440 in SystemNative_LowLevelMonitor_TimedWait (monitor=0x7f68b8001310, timeoutMilliseconds=12000) at /__w/1/s/src/native/libs/System.Native/pal_threading.c:195
#2  0x0000000040f90e87 in ?? ()
#3  0x0000000000000000 in ?? ()

Thread 9 (Thread 0x7f68c8073700 (LWP 39)):
#0  0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d028f2fc in SystemNative_LowLevelMonitor_Wait (monitor=0x7f68b00052f0) at /__w/1/s/src/native/libs/System.Native/pal_threading.c:155
#2  0x0000000040f3791c in ?? ()
#3  0x00007f68d25c8590 in ?? ()
#4  0xffffffffffffffff in ?? ()
#5  0x0000000000000000 in ?? ()

Thread 8 (Thread 0x7f68be5f9700 (LWP 40)):
#0  0x00007f68d40a61d9 in waitpid () from /lib64/libpthread.so.0
#1  0x00007f68d31e0fd7 in dump_native_stacktrace (signal=<optimized out>, mctx=<optimized out>) at /__w/1/s/src/mono/mono/mini/mini-posix.c:843
#2  mono_dump_native_crash_info (signal=<optimized out>, mctx=0x7f68be5f7898, info=<optimized out>) at /__w/1/s/src/mono/mono/mini/mini-posix.c:870
#3  0x00007f68d318759e in mono_handle_native_crash (signal=0x7f68d2ef9c0c "SIGABRT", mctx=0x7f68be5f7898, info=0x7f68be5f7b70) at /__w/1/s/src/mono/mono/mini/mini-exceptions.c:3005
#4  0x00007f68d31e0768 in sigabrt_signal_handler (_dummy=<optimized out>, _info=0x7f68be5f7b70, context=0x7f68be5f7a40) at /__w/1/s/src/mono/mono/mini/mini-posix.c:225
#5  <signal handler called>
#6  0x00007f68d32db387 in raise () from /lib64/libc.so.6
#7  0x00007f68d32dca78 in abort () from /lib64/libc.so.6
#8  0x00007f68d32d41a6 in __assert_fail_base () from /lib64/libc.so.6
#9  0x00007f68d32d4252 in __assert_fail () from /lib64/libc.so.6
#10 0x00007f68d02805ed in ToFileDescriptor (fd=-1) at /__w/1/s/src/native/libs/Common/pal_utilities.h:86
#11 0x00007f68d0280fb5 in SystemNative_FcntlGetFD (fd=-1) at /__w/1/s/src/native/libs/System.Native/pal_io.c:611
#12 0x00000000411053fb in ?? ()
#13 0x0000000000000000 in ?? ()

Thread 7 (Thread 0x7f68bedfa700 (LWP 41)):
#0  0x00007f68d40a2de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d3079f34 in mono_os_cond_timedwait (cond=0x555fcfdaadc0, mutex=0x555fcfdaad98, timeout_ms=300000) at /__w/1/s/src/mono/mono/utils/mono-os-mutex.c:75
#2  0x00007f68d302d52a in mono_coop_cond_timedwait (cond=<optimized out>, mutex=<optimized out>, timeout_ms=300000) at /__w/1/s/src/mono/mono/mini/../../mono/utils/mono-coop-mutex.h:103
#3  mono_w32handle_timedwait_signal_naked (cond=<optimized out>, mutex=<optimized out>, timeout=300000, poll=0, alerted=<optimized out>) at /__w/1/s/src/mono/mono/metadata/w32handle.c:514
#4  mono_w32handle_timedwait_signal_handle (handle_data=<optimized out>, timeout=300000, poll=0, alerted=0x7f68bedf64d4) at /__w/1/s/src/mono/mono/metadata/w32handle.c:629
#5  0x00007f68d302d29a in mono_w32handle_wait_one (handle=<optimized out>, timeout=300000, alertable=<optimized out>) at /__w/1/s/src/mono/mono/metadata/w32handle.c:738
#6  0x00007f68d3050d54 in mono_monitor_wait (obj_handle=..., ms=<optimized out>, allow_interruption=1 '\001', error=<optimized out>) at /__w/1/s/src/mono/mono/metadata/monitor.c:1364
#7  ves_icall_System_Threading_Monitor_Monitor_wait (obj_handle=..., ms=<optimized out>, allow_interruption=1 '\001', error=<optimized out>) at /__w/1/s/src/mono/mono/metadata/monitor.c:1443
#8  0x00007f68d2fe00f3 in ves_icall_System_Threading_Monitor_Monitor_wait_raw (a0=0x7f68bedf6690, a1=300000, a2=1 '\001') at /__w/1/s/src/mono/mono/mini/../metadata/icall-def.h:570
#9  0x0000000041105bdc in ?? ()
#10 0x00007f68d25ab748 in ?? ()
#11 0x00000000000493e0 in ?? ()
#12 0x0000000000000001 in ?? ()
#13 0x00000000000493e0 in ?? ()
#14 0x00007f68d25ab748 in ?? ()
#15 0x00007f68bedf6cf8 in ?? ()
#16 0x00007f68bedf67a0 in ?? ()
#17 0x00007f68bedf6650 in ?? ()
#18 0x00007f68d25ab748 in ?? ()
#19 0x00000000411059d0 in ?? ()
#20 0x00007f68d25ab748 in ?? ()
#21 0x00000000000493e0 in ?? ()
#22 0x00007f68bedf67a0 in ?? ()
#23 0x0000000041105954 in ?? ()
#24 0x0000000000185b8b in ?? ()
#25 0x00007f68d29581b0 in ?? ()
#26 0x00000000000493e0 in ?? ()
#27 0x0000000041104c24 in ?? ()
#28 0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7f68bc84b700 (LWP 42)):
#0  0x00007f68d40a2de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d028f440 in SystemNative_LowLevelMonitor_TimedWait (monitor=0x7f689c002420, timeoutMilliseconds=30000) at /__w/1/s/src/native/libs/System.Native/pal_threading.c:195
#2  0x0000000040f90e87 in ?? ()
#3  0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7f68bc3ff700 (LWP 51)):
#0  0x00007f68d33a40e3 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f68d028a00e in WaitForSocketEventsInner (port=9, buffer=0x7f68a8276d10, count=0x7f68bc3febf8) at /__w/1/s/src/native/libs/System.Native/pal_networking.c:2723
#2  0x00007f68d0289f2f in SystemNative_WaitForSocketEvents (port=9, buffer=0x7f68a8276d10, count=0x7f68bc3febf8) at /__w/1/s/src/native/libs/System.Native/pal_networking.c:3023
#3  0x00000000410b5a79 in ?? ()
#4  0x0000000000000001 in ?? ()
#5  0x0000000000000001 in ?? ()
#6  0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7f689bfff700 (LWP 53)):
#0  0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d028f2fc in SystemNative_LowLevelMonitor_Wait (monitor=0x7f6894004fd0) at /__w/1/s/src/native/libs/System.Native/pal_threading.c:155
#2  0x0000000040f3791c in ?? ()
#3  0x00007f68d24f4da0 in ?? ()
#4  0xffffffffffffffff in ?? ()
#5  0x0000000000185b87 in ?? ()
#6  0x0000000000000001 in ?? ()
#7  0x00007f68d24f4dd0 in ?? ()
#8  0xffffffffffffffff in ?? ()
#9  0x0000000000185b87 in ?? ()
#10 0x00007f6894000fd0 in ?? ()
#11 0x00007f689bffe9f0 in ?? ()
#12 0x00007f689bffe900 in ?? ()
#13 0x00007f68940008c0 in ?? ()
#14 0x0000000040f3785c in ?? ()
#15 0x00007f689bffea70 in ?? ()
#16 0x0000000040f3781c in ?? ()
#17 0x00007f68d24f4dd0 in ?? ()
#18 0x0000000040f36ea8 in ?? ()
#19 0x00007f689bffea70 in ?? ()
#20 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f689bbeb700 (LWP 58)):
#0  0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d3051721 in mono_os_cond_wait (cond=0x7f688c003970, mutex=0x7f688c010e20) at /__w/1/s/src/mono/mono/mini/../../mono/utils/mono-os-mutex.h:219
#2  mono_coop_cond_wait (cond=0x7f688c003970, mutex=0x7f688c010e20) at /__w/1/s/src/mono/mono/mini/../../mono/utils/mono-coop-mutex.h:91
#3  mono_monitor_try_enter_inflated (obj=0x7f68d25a8ea0, ms=4294967295, allow_interruption=0, id=76) at /__w/1/s/src/mono/mono/metadata/monitor.c:875
#4  0x00007f68d3050235 in mono_monitor_try_enter_loop_if_interrupted (obj=0x7f68d25a8ea0, ms=4294967295, allow_interruption=<optimized out>, lockTaken=0x7f689bbea5d0 "", error=0x7f688c010e00) at /__w/1/s/src/mono/mono/metadata/monitor.c:1136
#5  0x0000000040fc8e48 in ?? ()
#6  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f689b9ea700 (LWP 62)):
#0  0x00007f68d40a2de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d3079f34 in mono_os_cond_timedwait (cond=0x7f689b9e9980, mutex=0x7f68c017f7b0, timeout_ms=20000) at /__w/1/s/src/mono/mono/utils/mono-os-mutex.c:75
#2  0x00007f68d307f6e4 in mono_coop_cond_timedwait (cond=0x7f689b9e9980, mutex=<optimized out>, timeout_ms=20000) at /__w/1/s/src/mono/mono/mini/../../mono/utils/mono-coop-mutex.h:103
#3  mono_lifo_semaphore_timed_wait (semaphore=0x7f68c017f7b0, timeout_ms=20000) at /__w/1/s/src/mono/mono/utils/lifo-semaphore.c:48
#4  0x0000000040f9c837 in ?? ()
#5  0x0000000000000002 in ?? ()
#6  0x0000000000000046 in ?? ()
#7  0x00007f68d25c4810 in ?? ()
#8  0x00007f68d25c4810 in ?? ()
#9  0x0000000000004e20 in ?? ()
#10 0x00007f6890000fd0 in ?? ()
#11 0x0000000000000046 in ?? ()
#12 0x00007f689b9e99f0 in ?? ()
#13 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f68d44c7780 (LWP 24)):
#0  0x00007f68d40a2a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f68d028f2fc in SystemNative_LowLevelMonitor_Wait (monitor=0x555fd08efe80) at /__w/1/s/src/native/libs/System.Native/pal_threading.c:155
#2  0x0000000040f3791c in ?? ()
#3  0x00007f68bc9c3f10 in ?? ()
#4  0xffffffffffffffff in ?? ()
#5  0x00007f68d25b7530 in ?? ()
#6  0x0000000000000001 in ?? ()
#7  0x00007f68bc9c3f40 in ?? ()
#8  0xffffffffffffffff in ?? ()
#9  0x00007f68d25b7530 in ?? ()
#10 0x0000555fcfe60430 in ?? ()
#11 0x00007ffca22e9b60 in ?? ()
#12 0x00007ffca22e9a70 in ?? ()
#13 0x0000555fcfdeeb30 in ?? ()
#14 0x0000000040f3785c in ?? ()
#15 0x00007ffca22e9be0 in ?? ()
#16 0x0000000040f3781c in ?? ()
#17 0x00007f68bc9c3f40 in ?? ()
#18 0x0000000040f36ea8 in ?? ()
#19 0x00007ffca22e9be0 in ?? ()
#20 0x0000000000000000 in ?? ()
[Inferior 1 (process 24) detached]

=================================================================
	Basic Fault Address Reporting
=================================================================
Memory around native instruction pointer (0x7f68d32db387):0x7f68d32db377  48 63 d7 48 63 f6 48 63 f9 b8 ea 00 00 00 0f 05  Hc.Hc.Hc........
0x7f68d32db387  48 3d 00 f0 ff ff 77 1e f3 c3 0f 1f 80 00 00 00  H=....w.........
0x7f68d32db397  00 85 c9 7f db 89 c8 f7 d8 81 e1 ff ff ff 7f 0f  ................
0x7f68d32db3a7  44 c6 89 c1 eb ca 48 8b 15 9c 0a 39 00 f7 d8 64  D.....H....9...d

=================================================================
	Managed Stacktrace:
=================================================================
	  at <unknown> <0xffffffff>
	  at Fcntl:<GetFD>g____PInvoke|5_0 <0x000aa>
	  at Fcntl:GetFD <0x00033>
	  at System.Net.Sockets.SafeSocketHandle:TryUnblockSocket <0x00093>
	  at System.Net.Sockets.SafeSocketHandle:CloseAsIs <0x00203>
	  at System.Net.Sockets.Socket:Dispose <0x0070b>
	  at System.Net.Sockets.Socket:Dispose <0x000e1>
	  at System.Net.Sockets.Socket:Close <0x000db>
	  at System.Net.Sockets.TcpClient:Dispose <0x0014b>
	  at System.Net.Sockets.TcpClient:Dispose <0x00031>
	  at System.Net.Mail.SmtpConnection:ShutdownConnection <0x0023b>
	  at System.Net.Mail.SmtpConnection:Abort <0x0002f>
	  at System.Net.Mail.SmtpTransport:Abort <0x000ab>
	  at System.Net.Mail.SmtpClient:Abort <0x00033>
	  at System.Net.Mail.SmtpClient:TimeOutCallback <0x0004f>
	  at <>c:<.cctor>b__27_0 <0x0005e>
	  at System.Threading.ExecutionContext:RunInternal <0x000e2>
	  at System.Threading.TimerQueueTimer:CallCallback <0x000fb>
	  at System.Threading.TimerQueueTimer:Fire <0x0012f>
	  at System.Threading.TimerQueue:FireNextTimers <0x00367>
	  at System.Threading.TimerQueue:System.Threading.IThreadPoolWorkItem.Execute <0x0002b>
	  at System.Threading.ThreadPoolWorkQueue:Dispatch <0x0043b>
	  at WorkerThread:WorkerThreadStart <0x001cb>
	  at System.Threading.Thread:StartCallback <0x000f0>
	  at System.Object:runtime_invoke_void__this__ <0x00091>
=================================================================
./RunTests.sh: line 168:    24 Aborted                 (core dumped) "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Net.Mail.Functional.Tests.runtimeconfig.json --depsfile System.Net.Mail.Functional.Tests.deps.json xunit.console.dll System.Net.Mail.Functional.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE
/root/helix/work/workitem/e
----- end Tue Aug 8 18:42:29 UTC 2023 ----- exit code 134 ----------------------------------------------------------
exit code 134 means SIGABRT Abort. Managed or native assert, or runtime check such as heap corruption, caused call to abort(). Core dumped.
ulimit -c value: unlimited
[ 1452.802029] docker0: port 1(veth319b311) entered blocking state
[ 1452.802030] docker0: port 1(veth319b311) entered forwarding state
[ 1549.372157] docker0: port 1(veth319b311) entered disabled state
[ 1549.372185] veth3587465: renamed from eth0
[ 1549.490395] docker0: port 1(veth319b311) entered disabled state
[ 1549.491393] device veth319b311 left promiscuous mode
[ 1549.491407] docker0: port 1(veth319b311) entered disabled state
[ 1565.561385] docker0: port 1(vetha7a7ea0) entered blocking state
[ 1565.561387] docker0: port 1(vetha7a7ea0) entered disabled state
[ 1565.561498] device vetha7a7ea0 entered promiscuous mode
[ 1565.561595] IPv6: ADDRCONF(NETDEV_UP): vetha7a7ea0: link is not ready
[ 1565.872486] eth0: renamed from veth9a26e10
[ 1565.919146] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1565.921079] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1565.921090] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 1565.921108] IPv6: ADDRCONF(NETDEV_CHANGE): vetha7a7ea0: link becomes ready
[ 1565.921124] docker0: port 1(vetha7a7ea0) entered blocking state
[ 1565.921125] docker0: port 1(vetha7a7ea0) entered forwarding state
[ 1570.069243] docker0: port 1(vetha7a7ea0) entered disabled state
[ 1570.069297] veth9a26e10: renamed from eth0
[ 1570.160406] docker0: port 1(vetha7a7ea0) entered disabled state
[ 1570.162232] device vetha7a7ea0 left promiscuous mode
[ 1570.162243] docker0: port 1(vetha7a7ea0) entered disabled state
[ 1575.982520] docker0: port 1(veth5305d79) entered blocking state
[ 1575.982523] docker0: port 1(veth5305d79) entered disabled state
[ 1575.982569] device veth5305d79 entered promiscuous mode
[ 1575.986841] IPv6: ADDRCONF(NETDEV_UP): veth5305d79: link is not ready
[ 1576.280354] eth0: renamed from veth0d8b7bd
[ 1576.328095] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1576.329880] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1576.329891] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 1576.329907] IPv6: ADDRCONF(NETDEV_CHANGE): veth5305d79: link becomes ready
[ 1576.329922] docker0: port 1(veth5305d79) entered blocking state
[ 1576.329923] docker0: port 1(veth5305d79) entered forwarding state
[ 1585.750024] docker0: port 1(veth5305d79) entered disabled state
[ 1585.750049] veth0d8b7bd: renamed from eth0
[ 1585.835519] docker0: port 1(veth5305d79) entered disabled state
[ 1585.837601] device veth5305d79 left promiscuous mode
[ 1585.837608] docker0: port 1(veth5305d79) entered disabled state
[ 1593.863487] docker0: port 1(veth62bb91d) entered blocking state
[ 1593.863490] docker0: port 1(veth62bb91d) entered disabled state
[ 1593.863529] device veth62bb91d entered promiscuous mode
[ 1593.863679] IPv6: ADDRCONF(NETDEV_UP): veth62bb91d: link is not ready
[ 1594.140330] eth0: renamed from veth83150e2
[ 1594.187973] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1594.189905] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1594.189915] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 1594.189933] IPv6: ADDRCONF(NETDEV_CHANGE): veth62bb91d: link becomes ready
[ 1594.189949] docker0: port 1(veth62bb91d) entered blocking state
[ 1594.189950] docker0: port 1(veth62bb91d) entered forwarding state
Waiting a few seconds for any dump to be written..
cat /proc/sys/kernel/core_pattern: /home/helixbot/dotnetbuild/dumps/core.%u.%p
cat /proc/sys/kernel/core_uses_pid: 0
cat /proc/sys/kernel/coredump_filter:
Looking around for any Linux dump..
cat: /proc/sys/kernel/coredump_filter: No such file or directory
... found no dump in /root/helix/work/workitem/e
+ export _commandExitCode=134
+ _commandExitCode=134
