Query Hash and Query Plan Hash Mapping
The problem π︎
As a fan of Dynamic Management Objects (DMOs), Query Store and Extended Events, I often work with the Query Hash and Query Plan Hash (I’ll refer to them collectively as the Hashes).
But the issue is that DMOs and Query Store use a different data type for storage than Extended Events. So I’ll cover the possible ways of efficiently mapping those types.
Iβm expecting a certain familiarity with these concepts in this blog post. However, if you are not familiar, I recommend the article Query Fingerprints and Plan Fingerprints.
Data types π︎
DMOs π︎
These are all the DMOs that reference the Hashes.
SELECT
sc.object_id
, so.name AS objName
, so.type_desc
, sc.name AS columnName
, t.name AS typeName
, sc.max_length
FROM
sys.system_columns AS sc
JOIN sys.system_objects AS so
ON so.object_id = sc.object_id
JOIN sys.types AS t
ON t.system_type_id = sc.system_type_id
AND t.user_type_id = sc.user_type_id
WHERE
sc.name IN ('query_hash', 'query_plan_hash')
ORDER BY objName
We can see that all of these are stored as binary(8)
Query Plan π︎
You can also find the Hashes in the Actual or Estimated Query Plan. They are also stored as binary(8)
, and you can find them either in the plan properties (of the main node).
Or in the underlying XML.
Extended Events π︎
For Extended Events, some events reference the Hashes directly.
/* XE Event list */
SELECT
dxp.name AS packageName
, dxo.name AS eventName
, dxoc.name AS columnName
, dxoc.column_id
, dxoc.type_name
FROM
sys.dm_xe_objects dxo
JOIN sys.dm_xe_packages dxp ON dxo.package_guid = dxp.guid
JOIN sys.dm_xe_object_columns AS dxoc
ON dxoc.object_name = dxo.name
AND dxoc.object_package_guid = dxo.package_guid
AND dxoc.column_type <> N'readonly'
WHERE
dxo.object_type = 'event'
AND dxoc.name LIKE 'query_%hash%'
But when the Extended Events doesn’t collect the Hashes by default, you can add them (where applicable) with the Global Fields/Actions.
/* XE Global Action list */
SELECT
CONCAT(dxp.name, '.', dxo.name) AS completeName
, dxo.name AS ActionName
, dxo.type_name
, dxo.description
FROM
sys.dm_xe_objects dxo
JOIN sys.dm_xe_packages dxp ON dxo.package_guid = dxp.guid
WHERE
dxo.object_type = 'action'
and dxo.name LIKE 'query_%hash%'
In the Actions and Events, we can see that the most common data type is the int64
for the _signed
version of the Hashes and uint64
for the unsigned.
The int64
translates to the bigint
SQL Server data type.
So which one should you use?
⚠️ Spoiler alert: always use the
_signed
.
The mapping π︎
When the data types don’t match, you must convert at least one side.
If you match a Hash constant against a table, converting the constant is more efficient than the column.
It depends on the use case, so let’s cover a few of them.
Create a test environment π︎
For our demo, we’ll create three things
- A database with a Query Store enabled
- A procedure with several statements (so we can filter)
- An Extended Events monitor
Here’s a script that creates all three.
CREATE DATABASE QueryHash
ALTER DATABASE QueryHash
SET QUERY_STORE = ON (QUERY_CAPTURE_MODE = ALL)
GO
USE QueryHash
GO
CREATE OR ALTER PROCEDURE dbo.FindMyHash
AS
BEGIN
/* statement 1 */
SELECT * FROM sys.objects AS o
WHERE o.create_date > DATEADD(DAY, -1, GETDATE())
/* statement 2 */
; -- previous stmt terminator
WITH
L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
L1 AS(SELECT 1 AS c FROM L0 CROSS JOIN L0 AS B),
L2 AS(SELECT 1 AS c FROM L1 CROSS JOIN L1 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L2),
tally AS (SELECT TOP (10) n FROM Nums ORDER BY n)
SELECT
n
FROM tally
/* statement 3 */
SELECT
d.name
, d.database_id AS dbId
, d.log_reuse_wait_desc
, d.compatibility_level AS compat
, d.user_access_desc AS access
, d.state_desc
, d.snapshot_isolation_state_desc AS snpsht
, d.is_read_committed_snapshot_on AS rcsi
, d.recovery_model_desc
, d.page_verify_option_desc AS pageVerify
, d.is_fulltext_enabled AS fulltxt
, d.is_trustworthy_on AS trstworthy
, d.is_db_chaining_on AS dbChaining
, d.is_query_store_on AS QS
, d.is_broker_enabled AS brkr
, d.is_cdc_enabled AS cdc
, d.is_encrypted AS encr
, d.delayed_durability_desc AS delayedDur
, d.is_result_set_caching_on
, d.is_memory_optimized_enabled
FROM sys.databases AS d
END
GO
/* Create a XE monitoring session and start it */
CREATE EVENT SESSION [FindMyHashXE] ON SERVER
ADD EVENT sqlserver.sp_statement_starting
(
SET
collect_object_name=1
, collect_statement=1
ACTION
(
sqlserver.query_hash /* to showcase the difference */
, sqlserver.query_hash_signed
, sqlserver.query_plan_hash /* to showcase the difference */
, sqlserver.query_plan_hash_signed
)
WHERE
[object_name] = N'FindMyHash'
)
GO
ALTER EVENT SESSION FindMyHashXE ON SERVER STATE = START
binary(8) to bigint π︎
The XE session FindMyHashXE
is started, but we don’t persist the data. Therefore we have to open the Watch Live Data view to collect the events.
Next, let’s run the Procedure couple of times.
EXEC dbo.FindMyHash
GO 10
/* GO 10 = repeats the batch 10 times */
Let’s say I’m interested in the second statement.
I could grab the Hashes from the Actual Execution Plan (Estimated doesn’t really work for the Stored Procedure call).
Here’s the XML excerpt. The QueryHash and QueryPlanHash attributes are in the middle.
<StmtSimple
StatementCompId="4"
StatementEstRows="10"
StatementId="2"
StatementOptmLevel="FULL"
StatementOptmEarlyAbortReason="GoodEnoughPlanFound"
CardinalityEstimationModelVersion="150"
StatementSubTreeCost="9.7468E-05"
StatementText="WITH
 L0 AS(SELECT 1 AS c <shortened>β¦"
StatementType="SELECT"
QueryHash="0x332D6151BE597D59"
QueryPlanHash="0x00DDF16427C45420"
RetrievedFromCache="true"
StatementSqlHandle="0x090042CC626E212F5B6C7B803CDB7A69839A0000000000000000000000000000000000000000000000000000"
DatabaseContextSettingsId="2"
ParentObjectId="581577110"
StatementParameterizationType="0"
SecurityPolicyApplied="false"
>
Copy out the Hashes manually and CAST
them to bigint
.
SELECT
CAST(0x332D6151BE597D59 AS bigint) AS query_hash_bigint,
CAST(0x00DDF16427C45420 AS bigint) AS query_plan_hash_bigint
/* 3687710673600085337 and 62471382319256608 respectively */
Or even better, we can grab all the information straight from the Query Store.
SELECT
qsq.query_id
, qsp.plan_id
, qsq.query_hash
, CAST(qsq.query_hash AS bigint) AS query_hash_bigint
, qsp.query_plan_hash
, CAST(qsp.query_plan_hash AS bigint) AS query_plan_hash_bigint
, qsqt.query_sql_text
FROM sys.query_store_query AS qsq
JOIN sys.query_store_plan AS qsp
ON qsp.query_id = qsq.query_id
JOIN sys.query_store_query_text AS qsqt
ON qsqt.query_text_id = qsq.query_text_id
WHERE qsq.object_id = OBJECT_ID(N'dbo.FindMyHash')
Now we can go back to the Extended Events view and filter on the query_hash_signed
column using the bigint
value 3687710673600085337.
We opted to collect all executions in this demo and then filter them afterwards. But if you want to capture only a specific hash, you can add it as a filter in the XE session definition.
Let’s try the other scenario.
Bigint to binary(8) π︎
The tables have turned. Now you’ve noticed some problems with a query in the XE monitor, and you want to correlate with the Query Store to get more information.
First, we’ll clear the XE filter, and this time we’ll grab the query_plan_hash_signed
(7889493896768043449) for the third statement.
We’ll plug this value into the query from above. But since it’s not worth converting the whole column to match against a constant, we’ll slightly update the statement.
SELECT
qsq.query_id
, qsp.plan_id
, qsq.query_hash
, CAST(qsq.query_hash AS bigint) AS query_hash_bigint
, qsp.query_plan_hash
, CAST(qsp.query_plan_hash AS bigint) AS query_plan_hash_bigint
, qsqt.query_sql_text
FROM sys.query_store_query AS qsq
JOIN sys.query_store_plan AS qsp
ON qsp.query_id = qsq.query_id
JOIN sys.query_store_query_text AS qsqt
ON qsqt.query_text_id = qsq.query_text_id
WHERE qsp.query_plan_hash = CAST(7889493896768043449 AS binary(8))
But wait! We didn’t get back any result.
And the reason is the Data Type Inference, as you can see in this demo.
DECLARE @bigintHash bigint = 7889493896768043449
SELECT
CAST(@bigintHash AS binary(8)) AS bigintToBin, /* 0x6D7D1CE61678B9B9 */
CAST(7889493896768043449 AS binary(8)) AS constToBin /* 0x13000001B9B97816 */
We have two same values on the input but two different hashes. One type is explicitly defined, while the other is inferred.
So what is this mysterious data type? To find out, I’ll use my favourite data type - sql_variant
.
DECLARE @test AS sql_variant = 7889493896768043449
SELECT
SQL_VARIANT_PROPERTY(@test, 'BaseType') AS BaseType
, SQL_VARIANT_PROPERTY(@test, 'Precision') AS Precision
, SQL_VARIANT_PROPERTY(@test, 'Scale') AS Scale
, SQL_VARIANT_PROPERTY(@test, 'TotalBytes') AS TotalBytes
, SQL_VARIANT_PROPERTY(@test, 'MaxLength') AS MaxLength
, SQL_VARIANT_PROPERTY(@test, 'Collation') AS Collation
, CAST(@test AS binary(8)) AS variantToBin
We can see that the underlying data type is numeric(19,0)
. Casting it to binary(8)
returns the same binary value as in the example with the constant.
So how to fix our Query Store mapping? There are two options.
Use a bigint
variable and store the value there.
DECLARE @queryPlanHash bigint = 7889493896768043449
SELECT
β¦
WHERE qsp.query_plan_hash = CAST(@queryPlanHash AS binary(8))
Or if you want to do it in a single statement - explicitly cast the constant as a bigint
.
SELECT
β¦
WHERE qsp.query_plan_hash = CAST(CAST(7889493896768043449 AS bigint) AS binary(8))
Both of these will get you the result.
The unsigned int π︎
I don’t know if there is a use case for this, but it deserves mention for the sake of completeness.
You can find the detailed information in the article Correlating XE query_hash and query_plan_hash to sys.dm_exec_query_statsβ¦.
I’ll do a simplified version with the just the calculation. I will pick the first statement because I need an example where query_hash <> query_hash_signed
.
As a reminder - these are our values:
field | value |
---|---|
query hash binary(8) | 0xBA6ED813C4878164 |
query hash (unsigned) | 13433912317905961316 |
query hash signed | -5012831755803590300 |
I don’t recommend this because we have to do the conversion on both sides of the mapping. We have to:
- Remove the most significant bit (MSB) from the binary value.
- Convert the unsigned number to a
bigint
.
To remove the MSB we’ll use the Bitwise AND (&
) with the max bigint value
which is (2^63)-1
(9,223,372,036,854,775,807) and can also be represented in binary as 0x7FFFFFFFFFFFFFFF
.
Because of this
In a bitwise operation, only one expression can be of either binary or varbinary data type.
we’ll convert one side of the equation to bigint
, and due to Data type precedence, that will also be the resulting data type.
As for the unsigned query hash, we’ll subtract the maximum negative value of bigint
, which is -2^63
(-9,223,372,036,854,775,808).
To put it all together.
SELECT
0xBA6ED813C4878164 AS bin
, CAST(0xBA6ED813C4878164 AS bigint) & 0x7FFFFFFFFFFFFFFF AS binWithoutMSBtoBigint
, 13433912317905961316 AS unsignedBigint
, 13433912317905961316 - 9223372036854775808 AS signedBigint
, CASE WHEN
CAST(0xBA6ED813C4878164 AS bigint) & 0x7FFFFFFFFFFFFFFF =
13433912317905961316 - 9223372036854775808
THEN 1
ELSE 0
END AS isSame
Plugging this equation back into our Query Store statement would look like this.
SELECT
β¦
WHERE
CAST(qsq.query_hash AS bigint) & 0x7FFFFFFFFFFFFFFF = (13433912317905961316 - 9223372036854775808)
Which works, but it’s just terrible for the performance and readability. So please stick to the signed version of the Hashes!