Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using Concat for model/seed data in migrations may truncate long strings #24112

Closed
ajcvickers opened this issue Feb 10, 2021 · 7 comments
Closed
Assignees
Labels
area-migrations-seeding closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression Servicing-approved type-bug
Milestone

Comments

@ajcvickers
Copy link
Member

Originally reported as #23634 (comment) by @saliksaly.

Using CONCAT for strings is not so good due to string truncation to 4000 chars in SQL server :-(
Now the migrations run without error (Microsoft.Data.SqlClient.SqlException (0x80131904): The concat function requires 2 to 254 arguments.) but my seeded strings are now truncated in the database. Silently!

Here are some resources for this SQL server behavior:
https://stackoverflow.com/questions/4547821/using-varcharmax-with-string-concatenation-in-sql-server-2005
https://stackoverflow.com/questions/4833549/nvarcharmax-still-being-truncated
https://sqlundercover.com/2017/09/27/concatenation-truncation-are-your-strings-being-truncated/

Is there any workaround for this? We have had CSHTML templates in seed data and now it would not be posible in new version of .net?

@ajcvickers
Copy link
Member Author

/cc @bricelam

@smitpatel
Copy link
Member

My preliminary experimentation says things are not as straight-forward as in posts.
Results of CONCAT is the highest sized nvarchar from arguments.
Results of + is highest sized nvarchar from the 2 arguments according to type-precedence rules due to implicit casting.

@saliksaly
Copy link

My preliminary experimentation says things are not as straight-forward as in posts.
Results of CONCAT is the highest sized nvarchar from arguments.
Results of + is highest sized nvarchar from the 2 arguments according to type-precedence rules due to implicit casting.

Here is a nice explanation of the Data type precedence used:
https://stackoverflow.com/questions/1371383/for-nvarcharmax-i-am-only-getting-4000-characters-in-tsql/1371584#1371584

There is a workaround:
concat with an additional nvarchar(max) member (can be empty)

Let's try:

This returns only 4000 chars:
SELECT CONCAT(REPLICATE(N'A',4000), REPLICATE(N'B',4000), REPLICATE(N'C',4000))

This returns everything:
SELECT CONCAT(CAST('' AS NVARCHAR(MAX)), REPLICATE(N'A',4000), REPLICATE(N'B',4000), REPLICATE(N'C',4000))

(inspired here: https://sqlundercover.com/2017/09/27/concatenation-truncation-are-your-strings-being-truncated/)

@smitpatel
Copy link
Member

@saliksaly - I have gone through the official T-Sql documentation. My point is that regardless of issue happening due to data type precedence, same data type is applied for Concat function and + operator. So using of Concat is not in itself an issue.

@smitpatel
Copy link
Member

if (!omitVariableDeclarations)
{
builder.Append("DECLARE @description AS sql_variant")
.AppendLine(Dependencies.SqlGenerationHelper.StatementTerminator);
}
builder.Append("SET @description = ")
.Append(Literal(description))
.AppendLine(Dependencies.SqlGenerationHelper.StatementTerminator);

Comments declare variable of type sql_variant but still uses string type mapping to print out. sql_variant is incompatible with (n)varchar(max) so such casting generate invalid SQL. Further the limitation on sp_addextendedproperty value (which is sql_variant type) is 7500 bytes at max. So comments which crosses 4000 characters and contains line breaks will be truncated (probably at 3750 character mark) and it cannot be fixed at all.

In order to fix this issue, we need to change the migration code to stop using string type mapping to print sql_variant literal else fix for this issue is generating invalid SQL for a working case of comments.

cc: @bricelam

@smitpatel smitpatel removed this from the 5.0.5 milestone Feb 22, 2021
@ajcvickers ajcvickers added this to the 5.0.5 milestone Feb 23, 2021
@ajcvickers ajcvickers removed this from the 5.0.5 milestone Apr 28, 2021
@ajcvickers
Copy link
Member Author

/cc @bricelam

@AndriySvyryd AndriySvyryd added this to the 5.0.x milestone May 18, 2021
smitpatel added a commit that referenced this issue May 20, 2021
@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label May 20, 2021
smitpatel added a commit that referenced this issue May 21, 2021
smitpatel added a commit that referenced this issue May 25, 2021
@ajcvickers ajcvickers modified the milestones: 5.0.x, 5.0.8 Jul 11, 2021
@btastic
Copy link

btastic commented Jan 18, 2022

Hello @ajcvickers, sorry to revive this "old" issue.

We are facing some troubling query right now regarding truncated concat'ed strings, and we would like to show the code and query in question. However, since this is company and legal related, I can't just paste the code on here.

Is there any channel where I can submit this, without having to post company related code?

Edit: I decided to have a look at it myself, and found the issue. I created an issue for it: #27206

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-migrations-seeding closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression Servicing-approved type-bug
Projects
None yet
Development

No branches or pull requests

5 participants