Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack Overflow exception with SslStream and NetCoreApp mix of 2.1 and 3.x #38185

Closed
DavoudEshtehari opened this issue Jun 20, 2020 · 15 comments
Closed
Assignees
Labels
area-System.Net.Security bug os-linux Linux OS (any supported distro)
Milestone

Comments

@DavoudEshtehari
Copy link
Member

Description

Hi dotnet/runtime team,

I found an issue in Ubuntu 16.04 when I was using the SslStream. At first, I tried to reproduce it in a sample application in Ubuntu, and at last, I am able to reproduce it in Windows. I don't know if there is a similar issue with other types!

The issue happens when you try to use an assembly with NetCore 2.1 in another project with NetCore 3.x (I have not checked the NetCore 5.0 preview).

  • You can find the sample application here.

  • Here are the sample outputs:

image

  • You will get the same exception if you use nercoreapp3.0 in the 'Demo' project.

If you change the TargetFramework of the project ClassLibraryA to nercoreapp3.1 from netcoreapp2.1, or change TargetFramework of the project Demo to nercoreapp2.1 it will work without exception.

image

@ghost
Copy link

ghost commented Jun 22, 2020

Tagging subscribers to this area: @dotnet/ncl
Notify danmosemsft if you want to be subscribed.

@janvorli janvorli added os-linux Linux OS (any supported distro) and removed area-ExceptionHandling-coreclr untriaged New issue has not been triaged by the area owner labels Jun 22, 2020
@wfurt
Copy link
Member

wfurt commented Jun 24, 2020

I may misunderstand the use case but IO don't think it is safe to mix random assemblies from .NET versions.
SslStream in particular does depend on matching native part.
cc: @ericstj as he may have more insight to deployments with mixed versions.

@wfurt
Copy link
Member

wfurt commented Jul 2, 2020

I took closer look and here is what is going on. 2.1 does not have correct ref assemblies. That was fixed by @stephentoub in dotnet/corefx#29658 but it was not ported back to 2.1. With that base does not resolve properly and that creates code loop and eventually fills up stack.

in windbg

(1fc4.2ef8): Stack overflow - code c00000fd (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
coreclr!StreamNative::HasOverriddenBeginEndWrite+0x73:
00007ffc`ccedd283 e8ac050000      call    coreclr!HasOverriddenStreamMethod (00007ffc`ccedd834)
0:000> k
 # Child-SP          RetAddr           Call Site
00 000000ce`66c25ff0 00007ffc`cca4b9b3 coreclr!StreamNative::HasOverriddenBeginEndWrite+0x73 [f:\workspace\_work\1\s\src\vm\comutilnative.cpp @ 2451] 
01 000000ce`66c26160 00007ffc`cca4b726 System_Private_CoreLib!System.IO.Stream.BeginEndWriteAsync(Byte[], Int32, Int32)$##60043FD+0x33
02 000000ce`66c26200 00007ffc`6d451442 System_Private_CoreLib!System.IO.Stream.WriteAsync(Byte[], Int32, Int32, System.Threading.CancellationToken)$##60043FA+0x26
03 000000ce`66c26230 00007ffc`ceccbaeb 0x00007ffc`6d451442
04 000000ce`66c26280 00007ffc`cca4d96c System_Net_Security!System.Net.Security.SslStream.BeginWrite(Byte[], Int32, Int32, System.AsyncCallback, System.Object)$##6000283+0x4b
05 000000ce`66c262d0 00007ffc`ccae833e System_Private_CoreLib!System.IO.Stream+<>c.<BeginEndWriteAsync>b__61_0(System.IO.Stream, ReadWriteParameters, System.AsyncCallback, System.Object)$##6004463+0x2c
06 000000ce`66c26310 00007ffc`cca4ba9e System_Private_CoreLib!System.Threading.Tasks.TaskFactory`1[System.Threading.Tasks.VoidTaskResult].FromAsyncTrim[System.__Canon,System.IO.Stream+ReadWriteParameters](System.__Canon, ReadWriteParameters, System.Func`5<System.__Canon,ReadWriteParameters,System.AsyncCallback,System.Object,System.IAsyncResult>, System.Func`3<System.__Canon,System.IAsyncResult,System.Threading.Tasks.VoidTaskResult>)$##60020DA+0x7e
07 000000ce`66c26370 00007ffc`cca4b726 System_Private_CoreLib!System.IO.Stream.BeginEndWriteAsync(Byte[], Int32, Int32)$##60043FD+0x11e
08 000000ce`66c26410 00007ffc`6d451442 System_Private_CoreLib!System.IO.Stream.WriteAsync(Byte[], Int32, Int32, System.Threading.CancellationToken)$##60043FA+0x26
09 000000ce`66c26440 00007ffc`ceccbaeb 0x00007ffc`6d451442
0a 000000ce`66c26490 00007ffc`cca4d96c System_Net_Security!System.Net.Security.SslStream.BeginWrite(Byte[], Int32, Int32, System.AsyncCallback, System.Object)$##6000283+0x4b
0b 000000ce`66c264e0 00007ffc`ccae833e System_Private_CoreLib!System.IO.Stream+<>c.<BeginEndWriteAsync>b__61_0(System.IO.Stream, ReadWriteParameters, System.AsyncCallback, System.Object)$##6004463+0x2c
0c 000000ce`66c26520 00007ffc`cca4ba9e System_Private_CoreLib!System.Threading.Tasks.TaskFactory`1[System.Threading.Tasks.VoidTaskResult].FromAsyncTrim[System.__Canon,System.IO.Stream+ReadWriteParameters](System.__Canon, ReadWriteParameters, System.Func`5<System.__Canon,ReadWriteParameters,System.AsyncCallback,System.Object,System.IAsyncResult>, System.Func`3<System.__Canon,System.IAsyncResult,System.Threading.Tasks.VoidTaskResult>)$##60020DA+0x7e
0d 000000ce`66c26580 00007ffc`cca4b726 System_Private_CoreLib!System.IO.Stream.BeginEndWriteAsync(Byte[], Int32, Int32)$##60043FD+0x11e
0e 000000ce`66c26620 00007ffc`6d451442 System_Private_CoreLib!System.IO.Stream.WriteAsync(Byte[], Int32, Int32, System.Threading.CancellationToken)$##60043FA+0x26
0f 000000ce`66c26650 00007ffc`ceccbaeb 0x00007ffc`6d451442
10 000000ce`66c266a0 00007ffc`cca4d96c System_Net_Security!System.Net.Security.SslStream.BeginWrite(Byte[], Int32, Int32, System.AsyncCallback, System.Object)$##6000283+0x4b
11 000000ce`66c266f0 00007ffc`ccae833e System_Private_CoreLib!System.IO.Stream+<>c.<BeginEndWriteAsync>b__61_0(System.IO.Stream, ReadWriteParameters, System.AsyncCallback, System.Object)$##6004463+0x2c
12 000000ce`66c26730 00007ffc`cca4ba9e System_Private_CoreLib!System.Threading.Tasks.TaskFactory`1[System.Threading.Tasks.VoidTaskResult].FromAsyncTrim[System.__Canon,System.IO.Stream+ReadWriteParameters](System.__Canon, ReadWriteParameters, System.Func`5<System.__Canon,ReadWriteParameters,System.AsyncCallback,System.Object,System.IAsyncResult>, System.Func`3<System.__Canon,System.IAsyncResult,System.Threading.Tasks.VoidTaskResult>)$##60020DA+0x7e
13 000000ce`66c26790 00007ffc`cca4b726 System_Private_CoreLib!System.IO.Stream.BeginEndWriteAsync(Byte[], Int32, Int32)$##60043FD+0x11e
14 000000ce`66c26830 00007ffc`6d451442 System_Private_CoreLib!System.IO.Stream.WriteAsync(Byte[], Int32, Int32, System.Threading.CancellationToken)$##60043FA+0x26
15 000000ce`66c26860 00007ffc`ceccbaeb 0x00007ffc`6d451442

I think best option here is targeting 3.1 (if possible) for the wrapper or avoiding the missing functions.
cc: @stephentoub in case there is some other recommendation.

@wfurt wfurt changed the title Stack Overflow exception with SslStream and NetCoreApp 2.1 & 3.x Stack Overflow exception with SslStream and NetCoreApp mix of 2.1 and 3.x Jul 2, 2020
@DavoudEshtehari
Copy link
Member Author

Nice catch @wfurt
If I was the end-user maybe I could use the 3.1 version but I am working on the SqlClient driver and this will be a probable issue for the end-users.
In addition, I haven't thought about the new solution if I avoid the missing function, but by considering the current driver's design I know it will be complicated. So, I'd rather current solution.

@cheenamalhotra
Copy link
Member

@wfurt

I may misunderstand the use case but IO don't think it is safe to mix random assemblies from .NET versions.

Isn't .NET Core SDK supposed to be backwards compatible?
I mean, if a product ships only a netcoreapp2.1 DLL, it's not going to be compatible with "netcoreapp3.1" and "net5.0"?

@wfurt
Copy link
Member

wfurt commented Jul 6, 2020

It would. But since 2.1 is missing WriteAsync in reference assembly, you will not be able to use it.
This is bug specific to streams and way how SQL client tries to use it,
@ericstj may know if this is something we may be able to service in 2.1.x.

@ericstj
Copy link
Member

ericstj commented Jul 14, 2020

You should be able to service it in 2.1. Normally we don't touch reference assemblies in servicing, but since this is a case of a compatible change that the implementation already supports I think it is OK. 2.1 ships the reference assemblies with every servicing event via the Microsoft.NETCore.App package, so I think the change here is to simply make the fix to the reference assembly source. Of course for SqlClient to be able to use such a fix they'd need to rebuild using that updated package. SqlClient could also try to build against some private set of reference assemblies if you wanted, since this change doesn't actually ship inside the component that could be done to help unblock things quicker.

@karelz karelz added this to the Future milestone Jul 30, 2020
@karelz karelz added the bug label Jul 30, 2020
@karelz karelz modified the milestones: Future, 2.1.x Jul 30, 2020
@karelz
Copy link
Member

karelz commented Jul 30, 2020

@DavoudEshtehari is it blocking you? Do you have a business case for us to fix it in 2.1 servicing?
I assume there are no workarounds like using reflection, or something like that ... right?

@DavoudEshtehari
Copy link
Member Author

Hi @karelz

@DavoudEshtehari Davoud Eshtehari (Simba Technologies Inc) Vendor is it blocking you?

We have several issues related to the PR #579 that are blocked by this issue.

Do you have a business case for us to fix it in 2.1 servicing?

If I understood correctly, besides the sample app, you can try the PR I mentioned.

I assume there are no workarounds like using reflection, or something like that ... right?

You're right. I have no idea for a workaround. If you have any proven idea please, let me know.

@cheenamalhotra
Copy link
Member

cheenamalhotra commented Jul 30, 2020

@karelz

Below issues are waiting for PR dotnet/SqlClient#579:

As for workaround for driver issue, we have below:

  • Enabling MARS connection property.
  • Use Sync operations.

Although we do ship netcoreapp3.1 DLL with NuGet - that should not surface this issue, however, this is a bleak possible case where if users end up using netcoreapp2.1 DLL when targeting anything above it.

@karelz
Copy link
Member

karelz commented Sep 21, 2020

@CarnaViire will start the port. @cheenamalhotra will you be able to validate the change prior to checkin?

@wfurt
Copy link
Member

wfurt commented Sep 21, 2020

This was quite visible with the sample app @karelz. We should decide if we take while change or just the SslStream part (to limit possible impact)

@cheenamalhotra
Copy link
Member

Sure, let us know when changes are ready to be tested.

@karelz
Copy link
Member

karelz commented Sep 22, 2020

Sample app is good, we should try end-to-end validation as well though

Triage: We should take just the relevant APIs needed here, not the entire PR.

@CarnaViire
Copy link
Member

As per offline communication with SqlClient team,

We discovered that the SqlClient nuget package provides both 2.1 and 3.1 versions of the library. It means that for 3.1+ applications, 3.1 version will be resolved, not 2.1, so the repro with projects provided in the issue does not seem to represent SqlClient scenarios.

Given that there is no customer hitting it and given that it seems SqlClient users are not going to hit the problem on supported versions (2.1 or 3.1+), we decided that we should wait with servicing until we have real customer reports of hitting the problem.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net.Security bug os-linux Linux OS (any supported distro)
Projects
None yet
Development

No branches or pull requests

8 participants