Expecting Subvertations
I’ve seen a peculiar problem on Twitter about a straightforward code that produced an unexpected result. It grabbed my interest and I’ve tried to solve it myself.
It’s important to note, that I have been running this query on SQL Server 2019 (major version 15). The query looked something like this:
SELECT
SERVERPROPERTY('ProductMajorVersion') AS MajorVersion,
CASE
WHEN SERVERPROPERTY('ProductMajorVersion') < 13
THEN 'True'
ELSE 'False'
END AS LogicalTest
If it’s the first time you’re seeing a problem like this, it can be confusing.
How can 15
be less than 13
?
My first train of thought was the same as the original poster’s.
Ok, so both values are converted to varchar
according to data type precedence - similar to this:
SELECT
CASE
WHEN '12' < '2'
THEN 'True'
ELSE 'False'
END AS LogicalTest
Notice the quotes around the number. I’m not comparing numbers but numbers stored as a varchar
.
Because string comparison works on a character basis, '2'
is larger than '1'
(as 'B'
would be larger than 'A'
). The second character doesn’t matter.
But my theory had three major flaws.
First flaw π︎
'15'
isn’t smaller than '13'
. The first character '1'
is the same in both strings and second character '5'
is larger than '3'
.
SELECT
CASE
WHEN '15' < '13'
THEN 'True'
ELSE 'False'
END AS LogicalTest
Second flaw π︎
I haven’t checked the Data type precedence.
The int
is actually higher than the varchar
, so the varchar
would be converted to int
(if possible).
SELECT
CASE
WHEN '15' < 13
THEN 'True'
ELSE 'False'
END AS LogicalTest
Third flaw π︎
I haven’t checked the return type of the SERVERPROPERTY function.
It’s a sql_variant
and not varchar
, which makes sense. It can have several different data types returned for different properties.
And since sql_variant
type is almost at the top of the precedence list (one of the reasons it’s my favourite data type), the int
value will also be converted to sql_variant
.
This has been a perfect storm of bad assumptions. Now, let’s correct those.
sql_variant comparison π︎
Like Asimov’s Laws of Robotics, sql_variant
has also three rules for comparison.
Taken from the documentation (emphasis mine):
First rule π︎
When sql_variant values of different base data types are compared and the base data types are in different data type families, the value whose data type family is higher in the hierarchy chart is considered the greater of the two values.
That means values don’t matter, only data type families.
Let’s confirm the base data types of our values.
SELECT
SQL_VARIANT_PROPERTY
(
SERVERPROPERTY('ProductMajorVersion')
, 'BaseType'
) AS BaseTypeServerProperty
, SQL_VARIANT_PROPERTY
(
13
, 'BaseType'
) AS BaseTypeNumber
Which returns nvarchar
and int
respectively. Now we have to match them with their data type family.
I have aggregated the table so the data type familes are unique (original table is in the documentation link). The data types in rows are also ordered.
# | Data type family | Base data type |
---|---|---|
1 | sql_variant | sql_variant |
2 | Date and time | datetime2, datetimeoffset, datetime, smalldatetime, date, time |
3 | Approximate numeric | float, real |
4 | Exact numeric | decimal, money, smallmoney, bigint, int, smallint, tinyint, bit |
5 | Unicode | nvarchar, nchar, varchar, char |
6 | Binary | varbinary, binary |
7 | Uniqueidentifier | Uniqueidentifier |
The int
is higher in the table than the nvarchar
and is therefore considered greater of the two.
I can repeat the test with an int
and a datetime
.
DECLARE @dtVar sql_variant = GETDATE()
DECLARE @intVar sql_variant = 9999999
SELECT
CASE
WHEN @intVar < @dtVar
THEN 'True'
ELSE 'False'
END AS LogicalTest
Which evaluates to True
. That proves it and with this, our mystery is solved!
But just for fun, let’s do the remaining two rules.
Second rule π︎
When sql_variant values of different base data types are compared and the base data types are in the same data type family, the value whose base data type is lower in the hierarchy chart is implicitly converted to the other data type and the comparison is then made.
This doesn’t apply to the Unicode family as per the rule below. So let’s test with two different Exact numeric base types.
DECLARE @bigInt sql_variant = CAST(1 AS bigint)
DECLARE @intVar sql_variant = CAST(2 AS int)
SELECT
CASE
WHEN @bigInt < @intVar
THEN 'True'
ELSE 'False'
END AS LogicalTest
The int
value is considered greater (because it is) than the bigint
value even though the latter is higher in the data type table.
Now it’s actually comparing values.
Third rule π︎
When sql_variant values of the char, varchar, nchar, or nvarchar data types are compared, their collations are first compared based on the following criteria: LCID, LCID version, comparison flags, and sort ID. Each of these criteria are compared as integer values, and in the order listed. If all of these criteria are equal, then the actual string values are compared according to the collation.
I won’t do all of these combination, I’ll just try the first one - LCID.
SELECT COLLATIONPROPERTY('French_CI_AI', 'LCID')
SELECT COLLATIONPROPERTY('Latin1_General_CI_AS', 'LCID')
DECLARE @French sql_variant = N'abc' COLLATE French_CI_AI
DECLARE @English sql_variant = N'abc' COLLATE Latin1_General_CI_AS
SELECT
CASE
WHEN @English < @French
THEN 'True'
ELSE 'False'
END AS LogicalTest
The LCID for French is 1036 which is greater than 1033 for English. That proves @French
is greater than @English
.
Final thought π︎
To prevent these shenanigans, embrace the defensive programming.
If you have an assumption - either enforce it or verify it.
In this case, I would explicitly CAST
the SERVERPROPERTY
to an int
data type like so.
SELECT
SERVERPROPERTY('ProductMajorVersion') AS MajorVersion,
CASE
WHEN CAST(SERVERPROPERTY('ProductMajorVersion') AS int) < 13
THEN 'True'
ELSE 'False'
END AS LogicalTest