SQL Server String Concatenation and Truncation
We use a lot of dynamic SQL in the SQL Server database for our application (it seemed like a good idea at the time), and for some time we’ve noticed some strange issues with strings getting truncated when concatenating large dynamic queries. We’ve implemented a few different hacks over the years to work around it, but I had never really dug into what exactly was going on, until yesterday. As it turns out, there is a plethora of information about SQL string concatenation and truncation online, but the really useful stuff was all spread out and nestled in obscure forum posts, and I couldn’t find any clear recommendations on best practice to avoid all the different scenarios where truncation can occur. So I decided to write my own summary of the issues, and some simple ways to avoid them.
Best Practice for String Concatenation
1. If possible, just avoid concatenating really long strings. Generate one really long literal string with patterns for replacement, and replace each pattern using replace().
2. If you need to concatenate multiple long string variables, make them all the same type and max length (i.e. all varchar(max) or nvarchar(max)).
3. If you need to concatenate long string literals into a nvarchar variable, use the N prefix to make them nvarchar literals to prevent them being truncated to 4000 characters.
4. If you just want a quick & easy fix, you can just start the concatenation with a nvarchar(max) to ensure all subsequent concatenations convert to nvarchar(max), e.g: cast ('' as nvarchar(max)) + ...
Note that this will still cause some truncation if you have fixed length or literal varchars longer than 4000 characters (see notes below).
When Could Truncation Occur
When you concatenate two fixed length strings, the result is a fixed length string based on the sum of the two lengths. The longest fixed length varchar is 8000 characters and the longest fixed length nvarchar is 4000 characters, so anything longer will be truncated.
DECLARE @a NVARCHAR(3000) = REPLICATE ('X', 3000)
DECLARE @b NVARCHAR(3000) = REPLICATE ('X', 3000)
SELECT @a + @b
-- Output Length: 4000 Type: NVARCHAR(4000)
When we concatenate a nvarchar(max) with a fixed length varchar, the result will have a max length, but the text from the varchar expression will actually be truncated to 4000 characters. This seems crazy, but occurs because the first step of the concatenation is to convert the varchar to nvarchar, and the longest fixed length nvarchar is 4000 characters.
DECLARE @a NVARCHAR(MAX) = ''
DECLARE @b VARCHAR(5000) = REPLICATE ('X', 5000)
SELECT @a + @b
-- Output Length: 4000 Type: NVARCHAR(MAX)
When we concatenate a nvarchar(max) with a varchar string literal shorter than 8000 characters, we get the same truncation behaviour as above. If however the varchar string is longer than 8000 characters, no truncation occurs. This is because a varchar string literal shorter than 8000 characters is typed as a fixed length varchar, but anything longer is typed as a varchar(max). We can also bypass the truncation by writing our string literal with the N prefix which types it as an nvarchar.
DECLARE @a NVARCHAR(MAX) = ''
SELECT @a + '...imagine 5,000 characters here...'
-- Output Length: 4000 Type: NVARCHAR(MAX)
SELECT @a + '...imagine 10,000 characters here...'
-- Output Length: 10000 Type: NVARCHAR(MAX)
SELECT @a + N'...imagine 5,000 characters here...'
-- Output Length: 5000 Type: NVARCHAR(MAX)
Truncation never occurs when concatenating any length varchar with a varchar(max), or any length nvarchar with a nvarchar(max)
DECLARE @a NVARCHAR(4000) = REPLICATE ('X', 4000)
DECLARE @b NVARCHAR(MAX) = REPLICATE ('X', 4000)
SELECT @a + @b
-- Output Length: 8000 Type: NVARCHAR(MAX)
For multiple concatenations (e.g. @a + @b + @c), concatenation is performed in the usual order of operations (based on parenthesis or left to right), and the truncations noted above apply for each individual concatenation. One way to minimize truncation if you can’t follow the best practice is therefore to ensure that the first type in your concatenation is a varchar(max) or nvarchar(max), though if you use nvarchar(max) you will still get truncation with fixed length and literal varchars longer than 4000 characters.
DECLARE @a NVARCHAR(4000) = REPLICATE ('X', 4000)
DECLARE @b NVARCHAR(MAX) = REPLICATE ('X', 4000)
DECLARE @c VARCHAR(5000) = REPLICATE ('X', 5000)
SELECT @a + @a + @a
-- Output Length: 4000 Type: NVARCHAR(4000)
SELECT @a + @a + @b
-- Output Length: 8000 Type: NVARCHAR(MAX)
SELECT @b + @a + @a
-- Output Length: 12000 Type: NVARCHAR(MAX)
SELECT @b + @c + @c
-- Output Length: 12000 Type: NVARCHAR(MAX)
The Underlying Rules
If you want to read more about all the underlying rules that lead to this behaviour, you can find it all in the following SQL documentation: