Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 688 AsyncHelper.WaitForCompletion leaks unobserved exceptions #692

Merged
merged 13 commits into from Oct 22, 2020
Merged
Expand Up @@ -17,6 +17,8 @@
using System.Transactions;
using Microsoft.Data.Common;

[assembly: InternalsVisibleTo("FunctionalTests")]
cheenamalhotra marked this conversation as resolved.
Show resolved Hide resolved

namespace Microsoft.Data.SqlClient
{
internal static class AsyncHelper
Expand Down Expand Up @@ -204,6 +206,7 @@ internal static void WaitForCompletion(Task task, int timeout, Action onTimeout
}
if (!task.IsCompleted)
{
task.ContinueWith(t => { var ignored = t.Exception; }); //Ensure the task does not leave an unobserved exception
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

N.B. I repeated the experiment for the new lambda to confirm it's still not creating lots of instances of the closure

if (onTimeout != null)
{
onTimeout();
Expand Down
Expand Up @@ -17,6 +17,8 @@
using System.Threading.Tasks;
using SysTx = System.Transactions;

[assembly: InternalsVisibleTo("FunctionalTests")]

namespace Microsoft.Data.SqlClient
{
using Microsoft.Data.Common;
Expand Down Expand Up @@ -189,6 +191,7 @@ internal static void WaitForCompletion(Task task, int timeout, Action onTimeout
}
if (!task.IsCompleted)
{
task.ContinueWith(t => { var ignored = t.Exception; }); //Ensure the task does not leave an unobserved exception
if (onTimeout != null)
{
onTimeout();
Expand Down
Expand Up @@ -13,6 +13,11 @@ namespace Microsoft.Data.SqlClient.ManualTesting.Tests
{
public static class BaseProviderAsyncTest
{
private static void AssertTaskFaults(Task t)
{
Assert.ThrowsAny<Exception>(() => t.Wait(TimeSpan.FromMilliseconds(1)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Could you confirm if if providing timeout here is intentional?
Although it's a test, but it changes existing behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it's intentional - note that t.Wait returns false in the case of timeout (doesn't throw a TimeoutException) so if the task under test times out the test will fail, as no exception will be thrown. I thought this was preferable to the previous behaviour where if the Task hung the test would hang indefinitely (or until killed by the framework)

In this particular case all the Tasks under test are already completed ones generated by Task.FromResult, so we can get away with a 1ms timeout and know nothing will timeout.

However this is somewhat a matter of taste, and I don't know the details of whether your test suite is set up to handle timing out tests well. So if you'd rather have a longer timeout or no timeout I'm happy to change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original design for this is to verify that the unhandled exception can be propagated through tasks. We don't want the tests to fail due to this 1 ms timeout although it should not happen according to your explanation. I personally would prefer not setting timeout here.

}

[Fact]
public static void TestDbConnection()
{
Expand All @@ -37,8 +42,8 @@ public static void TestDbConnection()
{
Fail = true
};
connectionFail.OpenAsync().ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
connectionFail.OpenAsync(source.Token).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
AssertTaskFaults(connectionFail.OpenAsync());
AssertTaskFaults(connectionFail.OpenAsync(source.Token));

// Verify base implementation does not call Open when passed an already cancelled cancellation token
source.Cancel();
Expand Down Expand Up @@ -90,14 +95,14 @@ public static void TestDbCommand()
{
Fail = true
};
commandFail.ExecuteNonQueryAsync().ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting TaskContinuationOptions.OnlyOnFaulted and not checking the exception leads to the exception being unobserved:

https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.taskcontinuationoptions?view=netcore-3.1

"If you do not access the Exception property, the exception is unhandled."

I also was not a fan of the fact that if the command did not fail, the test would hand indefinitely, so I swapped to the current approach

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think I have found the source of these exceptions, and should be fixed in latest commit

commandFail.ExecuteNonQueryAsync(source.Token).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
commandFail.ExecuteReaderAsync().ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
commandFail.ExecuteReaderAsync(CommandBehavior.SequentialAccess).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
commandFail.ExecuteReaderAsync(source.Token).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
commandFail.ExecuteReaderAsync(CommandBehavior.SequentialAccess, source.Token).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
commandFail.ExecuteScalarAsync().ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
commandFail.ExecuteScalarAsync(source.Token).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
AssertTaskFaults(commandFail.ExecuteNonQueryAsync());
AssertTaskFaults(commandFail.ExecuteNonQueryAsync(source.Token));
AssertTaskFaults(commandFail.ExecuteReaderAsync());
AssertTaskFaults(commandFail.ExecuteReaderAsync(CommandBehavior.SequentialAccess));
AssertTaskFaults(commandFail.ExecuteReaderAsync(source.Token));
AssertTaskFaults(commandFail.ExecuteReaderAsync(CommandBehavior.SequentialAccess, source.Token));
AssertTaskFaults(commandFail.ExecuteScalarAsync());
AssertTaskFaults(commandFail.ExecuteScalarAsync(source.Token));

// Verify base implementation does not call Open when passed an already cancelled cancellation token
source.Cancel();
Expand All @@ -116,17 +121,17 @@ public static void TestDbCommand()
source = new CancellationTokenSource();
Task.Factory.StartNew(() => { command.WaitForWaitingForCancel(); source.Cancel(); });
Task result = command.ExecuteNonQueryAsync(source.Token);
Assert.True(result.IsFaulted, "Task result should be faulted");
Assert.True(result.Exception != null, "Task result should be faulted");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking IsFaulted does not count as observing an exception, accessing the Exception property does


source = new CancellationTokenSource();
Task.Factory.StartNew(() => { command.WaitForWaitingForCancel(); source.Cancel(); });
result = command.ExecuteReaderAsync(source.Token);
Assert.True(result.IsFaulted, "Task result should be faulted");
Assert.True(result.Exception != null, "Task result should be faulted");

source = new CancellationTokenSource();
Task.Factory.StartNew(() => { command.WaitForWaitingForCancel(); source.Cancel(); });
result = command.ExecuteScalarAsync(source.Token);
Assert.True(result.IsFaulted, "Task result should be faulted");
Assert.True(result.Exception != null, "Task result should be faulted");
}

[Fact]
Expand Down Expand Up @@ -155,9 +160,9 @@ public static void TestDbDataReader()

GetFieldValueAsync<object>(reader, 2, DBNull.Value);
GetFieldValueAsync<DBNull>(reader, 2, DBNull.Value);
reader.GetFieldValueAsync<int?>(2).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
reader.GetFieldValueAsync<string>(2).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
reader.GetFieldValueAsync<bool>(2).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
AssertTaskFaults(reader.GetFieldValueAsync<int?>(2));
AssertTaskFaults(reader.GetFieldValueAsync<string>(2));
AssertTaskFaults(reader.GetFieldValueAsync<bool>(2));
AssertEqualsWithDescription("GetValue", reader.LastCommand, "Last command was not as expected");

result = reader.ReadAsync();
Expand All @@ -174,12 +179,12 @@ public static void TestDbDataReader()
Assert.False(result.Result, "Should NOT have received a Result from NextResultAsync");

MockDataReader readerFail = new MockDataReader { Results = query.GetEnumerator(), Fail = true };
readerFail.ReadAsync().ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
readerFail.ReadAsync(source.Token).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
readerFail.NextResultAsync().ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
readerFail.NextResultAsync(source.Token).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
readerFail.GetFieldValueAsync<object>(0).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
readerFail.GetFieldValueAsync<object>(0, source.Token).ContinueWith((t) => { }, TaskContinuationOptions.OnlyOnFaulted).Wait();
AssertTaskFaults(readerFail.ReadAsync());
AssertTaskFaults(readerFail.ReadAsync(source.Token));
AssertTaskFaults(readerFail.NextResultAsync());
AssertTaskFaults(readerFail.NextResultAsync(source.Token));
AssertTaskFaults(readerFail.GetFieldValueAsync<object>(0));
AssertTaskFaults(readerFail.GetFieldValueAsync<object>(0, source.Token));

source.Cancel();
reader.LastCommand = "Nothing";
Expand Down
Expand Up @@ -42,6 +42,7 @@
<Compile Include="SqlCredentialTest.cs" />
<Compile Include="SqlDataRecordTest.cs" />
<Compile Include="SqlExceptionTest.cs" />
<Compile Include="SqlHelperTest.cs" />
<Compile Include="SqlParameterTest.cs" />
<Compile Include="SqlClientFactoryTest.cs" />
<Compile Include="SqlErrorCollectionTest.cs" />
Expand Down
@@ -0,0 +1,55 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using System;
using System.Threading;
using System.Threading.Tasks;
using Xunit;

namespace Microsoft.Data.SqlClient.Tests
{
public class SqlHelperTest
{
private void TimeOutATask()
{
TaskCompletionSource<bool> tcs = new TaskCompletionSource<bool>();
AsyncHelper.WaitForCompletion(tcs.Task, 1); //Will time out as task uncompleted
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love that this line causes the unit test to take a second to run - but that's the minimum timeout on WaitForCompletion

tcs.SetException(new TimeoutException("Dummy timeout exception")); //Our task now completes with an error
}

private Exception UnwrapException(Exception e)
{
return e?.InnerException != null ? UnwrapException(e.InnerException) : e;
}

[Fact]
public void WaitForCompletion_DoesNotCreateUnobservedException()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test does need a bit of rework to handle randomness. It fails randomly in pipelines.

You'll notice if you run this test in loop, the second round does not pass ever. Any thoughts why can't we run this test in second round? It maybe a hint towards random errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just looking into this - it seems the fix I had written to WaitForCompletion wasn't actually a fix. I needed to make the continuation actually observe the exception, the behaviour I had lead to a race condition, where if the continuation hadn't been executed at the point we called GC.Collect, then there was still a reference to the original task and it wasn't Collected. Creating a false positive for me having fixed it.

I proved this by modifying WaitForCompletion to return the continuation, and waited for it to complete, at which point the error happened deterministically. Making the continuation observe the exception fixed the error. My bad, I misinterpreted the text here: https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.taskcontinuationoptions?view=netcore-3.1 saying "If you do not access the Exception property, the exception is unhandled." To only apply to OnlyOnFaulted, when it applies to all TaskContinuationOptions.

I can't see a way to make the failure deterministic without passing some sort of signal back from the continuation, which doesn't fit the signature and isn't desirable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is just a test so make it as messy as it needs to be to get a deterministic answer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does pass deterministically now with the fix in place - what I couldn't get deterministic was the failure without the fix in place.

{
var unobservedExceptionHappenedEvent = new AutoResetEvent(false);
Exception unhandledException = null;
void handleUnobservedException(object o, UnobservedTaskExceptionEventArgs a)
{ unhandledException = a.Exception; unobservedExceptionHappenedEvent.Set(); }

TaskScheduler.UnobservedTaskException += handleUnobservedException;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't much like messing with global state from a unit test, but I'm not sure of what other way to test this. I had a look at TaskExceptionHolder (which is what is responsible for raising the event when finalized) but it looks anything along those lines would be messing with the internals of the Task system too much.


try
{
TimeOutATask(); //Create the task in another function so the task has no reference remaining
GC.Collect(); //Force collection of unobserved task
GC.WaitForPendingFinalizers();

bool unobservedExceptionHappend = unobservedExceptionHappenedEvent.WaitOne(1);
if (unobservedExceptionHappend) //Save doing string interpolation in the happy case
{
var e = UnwrapException(unhandledException);
Assert.False(true, $"Did not expect an unobserved exception, but found a {e?.GetType()} with message \"{e?.Message}\"");
}
}
finally
{
TaskScheduler.UnobservedTaskException -= handleUnobservedException;
}
}
}
}