Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible bug: SqlDataReader.ReadAsync hangs and produces high garbage collection. #439

Closed
aliveli186 opened this issue Feb 21, 2020 · 15 comments

Comments

@aliveli186
Copy link

Tested on: System.Data.SqlClient versions "4.4.0", "4.8.0" and "4.8.1"

I'm trying to read from a table which has 8 records.
In ReadAsync it fetches the first 6 records quickly and hangs around 45 seconds to fetch the rest 2.
During that 45 seconds time produces very high garbage collection.

The synchronous version on the other hand fetches the records immediately and there is no visible garbage collection.

while (await reader.ReadAsync(cancellationToken).ConfigureAwait(false))
{
...
}

vs.

while (reader.Read())
{
...
}
@Wraith2
Copy link
Contributor

Wraith2 commented Feb 21, 2020

System.Data.SqlClient in this repository has been deprecated in favour of Microsoft.Data.SqlClient. I've contributed a number of performance improvements to that version of the library and if you can produce a simple test case against then 2.0.0preview that replicates your problem I can look into it to see what's going on.

@aliveli186
Copy link
Author

aliveli186 commented Feb 21, 2020

@Wraith2 Same issue persists with Microsoft.Data.SqlClient versions "1.1.1" and "2.0.0-preview1.20021.1".
Here is my code:


using System;
using System.Collections.Generic;
using System.Data;
using System.Threading;
using System.Threading.Tasks;

using Microsoft.Data.SqlClient;
// using System.Data.SqlClient;

namespace ConsoleApp40
{
    internal class Program
    {
        private static async Task Main(string[] args)
        {
            try
            {
                var connectionString = "Data Source=127.0.0.1;User ID=blabla;Password=blabla;Initial Catalog=BlaBla;Persist Security Info=True;";
                var expressions      = "WHERE 1=1 AND [RequestTimestamp] >= '2020-02-20 23:00:00.0000000' AND [RequestTimestamp] < '2020-02-21 23:00:00.0000000'";
                var result           = await ErrorLogSelectAsync(connectionString, expressions, CancellationToken.None).ConfigureAwait(false);

                Console.WriteLine(result.Count);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex);
            }
        }

        public static async Task<List<int>> ErrorLogSelectAsync(string connectionString, string expressions, CancellationToken cancellationToken)
        {
            var parameters = new List<SqlParameter>
            {
                new SqlParameter("@Expressions", SqlDbType.NVarChar) {Value = expressions}
            };

            var i = 0;

            return await ExecuteStoredProcedureAsync(connectionString, "[Logging].[ErrorLogSelect]", null, parameters, reader =>
            {
                Console.WriteLine($"{DateTime.Now} {++i}: {reader.FieldCount}");

                return 0;
            }, cancellationToken).ConfigureAwait(false);
        }

        public static Task<List<TResult>> ExecuteStoredProcedureAsync<TResult>(string connectionString, string storedProcedureName, int? commandTimeout, IList<SqlParameter> parameters, Func<SqlDataReader, TResult> mapper, CancellationToken cancellationToken)
            => ExecuteConnectionAsync(connectionString, connection => ExecuteStoredProcedureAsync(connection, storedProcedureName, commandTimeout, parameters, mapper, cancellationToken), cancellationToken);

        public static async Task<TResult> ExecuteConnectionAsync<TResult>(string connectionString, Func<SqlConnection, Task<TResult>> processor, CancellationToken cancellationToken)
        {
            using (var connection = new SqlConnection(connectionString))
            {
                await connection.OpenAsync(cancellationToken).ConfigureAwait(false);
                return await processor(connection).ConfigureAwait(false);
            }
        }

        public static Task<List<TResult>> ExecuteStoredProcedureAsync<TResult>(SqlConnection connection, string storedProcedureName, int? commandTimeout, IList<SqlParameter> parameters, Func<SqlDataReader, TResult> mapper, CancellationToken cancellationToken)
            => ExecuteCommandAsync(connection, storedProcedureName, CommandType.StoredProcedure, commandTimeout, parameters, CommandBehavior.Default, mapper, cancellationToken);

        public static async Task<List<TResult>> ExecuteCommandAsync<TResult>(SqlConnection connection, string query, CommandType commandType, int? commandTimeout, IList<SqlParameter> parameters, CommandBehavior behavior, Func<SqlDataReader, TResult> mapper, CancellationToken cancellationToken)
        {
            using (var command = new SqlCommand(query, connection) {CommandType = commandType})
            {
                if (commandTimeout.HasValue)
                    command.CommandTimeout = commandTimeout.Value;

                if (parameters != null)
                {
                    foreach (var parameter in parameters)
                        command.Parameters.Add(parameter);
                }

                return await ExecuteCommandAsync(command, behavior, mapper, cancellationToken).ConfigureAwait(false);
            }
        }

        public static async Task<List<TResult>> ExecuteCommandAsync<TResult>(SqlCommand command, CommandBehavior behavior, Func<SqlDataReader, TResult> mapper, CancellationToken cancellationToken)
        {
            using (var reader = await command.ExecuteReaderAsync(behavior, cancellationToken).ConfigureAwait(false))
            {
                return await ReadToEndAsync(reader, mapper, cancellationToken).ConfigureAwait(false);
            }
        }

        public static async Task<List<TResult>> ReadToEndAsync<TResult>(SqlDataReader reader, Func<SqlDataReader, TResult> mapper, CancellationToken cancellationToken)
        {
            var values = new List<TResult>();

            while (await reader.ReadAsync(cancellationToken))
            {
                var value = mapper(reader);
                values.Add(value);
            }

            return values;

            /*
            return Task.Run(() =>
            {
                var values = new List<TResult>();

                while (reader.Read())
                {
                    var value = mapper(reader);
                    values.Add(value);
                }

                return values;
            }, cancellationToken);
            */
        }
    }
}

Console:
2/21/2020 11:33:43 PM 1: 28
2/21/2020 11:33:43 PM 2: 28
2/21/2020 11:33:43 PM 3: 28
2/21/2020 11:33:43 PM 4: 28
2/21/2020 11:33:43 PM 5: 28
2/21/2020 11:33:43 PM 6: 28
2/21/2020 11:34:03 PM 7: 28
2/21/2020 11:34:37 PM 8: 28
8

@Wraith2
Copy link
Contributor

Wraith2 commented Feb 21, 2020

@cheenamalhotra this might be worth moving

@cheenamalhotra cheenamalhotra transferred this issue from dotnet/runtime Feb 21, 2020
@cheenamalhotra
Copy link
Member

Thanks for the repro @aliveli186 .

The new home for SqlClient driver issues is "dotnet/sqlclient" repository, we will investigate and get back to you!

@Wraith2
Copy link
Contributor

Wraith2 commented Feb 21, 2020

@aliveli186 we'll need the sproc and any other database definitions it depends on.

@aliveli186
Copy link
Author

aliveli186 commented Feb 22, 2020

@Wraith2 The table it selects is a View which depends on few other tables, few joins. Can't provide the db schema now.

It's not a query problem because SQL Management Studio is working fine when executed with the same parameters.

CREATE PROCEDURE [Logging].[ErrorLogSelect] (@Expressions nvarchar(MAX) = NULL)
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @S nvarchar(MAX) = N'
    SELECT *
      FROM [Logging].[ErrorLog]';

    IF (@Expressions IS NOT NULL)
        SELECT @S = @S + N'
		   '        + @Expressions;

    PRINT @S;
    PRINT 'EXECUTING...';

    EXECUTE [sp_executesql] @S;

END;

@cheenamalhotra cheenamalhotra added this to Needs triage in SqlClient Triage Board via automation Feb 26, 2020
@cheenamalhotra cheenamalhotra moved this from Needs triage to Under Investigation in SqlClient Triage Board Feb 26, 2020
@tarurar
Copy link

tarurar commented Sep 25, 2020

@cheenamalhotra does the investigation clarified the issue? I am having same, await reader.ReadAsync() hangs forever.

@Wraith2
Copy link
Contributor

Wraith2 commented Sep 25, 2020

Investigation stalled because there isn't a complete repo, see "Can't provide the db schema now." above. If you have a self contained repo we can run then it can continue. Looking back at the report though It smells like long async strings to me and the answer is to not do that, fetch the string sync or use a textreader stream.

@tarurar
Copy link

tarurar commented Sep 25, 2020

@Wraith2 yep, fetching sync solves the issue. And the string was indeed huge, about 260 Mb.

@Wraith2
Copy link
Contributor

Wraith2 commented Sep 25, 2020

That's huge. Unless you actually need that as a string I strongly suggest using the GetTextReader overload in sequential mode, it'll save you a lot of GC work.

@roji
Copy link
Member

roji commented Sep 25, 2020

Another variant of #593, possibly a dup?

@Wraith2
Copy link
Contributor

Wraith2 commented Sep 25, 2020

Yup

@roji
Copy link
Member

roji commented Sep 25, 2020

@cheenamalhotra consider closing as a dup of #593

@cheenamalhotra
Copy link
Member

Sure! Thanks @roji and @Wraith2 :)
Closed as duplicate of #593

SqlClient Triage Board automation moved this from Under Investigation to Closed Sep 25, 2020
@Wraith2
Copy link
Contributor

Wraith2 commented Sep 25, 2020

I do sort of have a plan on how to deal with the long strings but it requires a lot of changes throughout the library.

The main problem is that each received packet starts the parse process again which we could avoid if we knew that we could just pick up from the end of the last packet's parse stall, but we can't do that because we can't tell a stall from a failure because all the TryRead* functions return Boolean where true is succeeded and failure could mean error or stall. If we replace those bools with OperationResult and flow that all the way through we could snapshot on stall and then resume from that snapshot at the next packet receipt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

No branches or pull requests

5 participants