All posts
How to · Blocking

Deploy at Low Priority

Deploy SQL Server schema changes without a maintenance window using a WAIT_AT_LOW_PRIORITY-inspired loop that checks for SCH-S locks and bails out with GOTO.

Tom · 5 min read
Deploy at Low Priority

The problem

I attempted to deploy a trigger change on a load-bearing table in a busy system. I was repeatedly blocked and, while waiting for the SCH-S lock, I ended up blocking other queries. I group blockers into two categories based on duration:

  • Long
    • You have enough time to detect the block and query WhoIsActive or another DMV utility to identify the blocker.
    • Can you kill it? It might be critical for business or have a long rollback time.
  • Short
    • These blockers come and go so quickly you might not even notice them.
    • You're usually blocked due to lock partitioning and your session running on a scheduler with a higher ID. (as I cover in Async stats update causing blocking)

Ideally, deployments should happen when there's no active locking on the object. This reminded me of WAIT_AT_LOW_PRIORITY feature of the online index rebuilds.

It would be great if this option were available for all schema changes, but since it isn't, I decided to build my own version.

Building blocks

To build this utility, we'll need:

  • A time-based loop (infinite loops can go wrong)
  • Check for incompatible locks
  • The actual code to be deployed
    • Most of the code deployments are required to be in a separate batch (in SSMS the batch separator is GO by default)
  • End gracefully if you cannot get the lock in the defined time
  • Any other safety measures

The main challenge was how to put the deployment script in a separate batch. The first method that came to mind was using dynamic SQL, but that would mean using dynamic SQL (and all its pitfalls like double apostrophes, formatting issues, lack of syntax highlighting, etc.). That's why I chose the rarely used GOTO.

Note This gives you one GO-bordered deploy slot. If your change spans multiple batches, you will need to wrap each batch in its own copy of the loop.

The script

SET DEADLOCK_PRIORITY LOW
SET XACT_ABORT ON
SET LOCK_TIMEOUT 2000 /* milliseconds */

SET NOEXEC OFF /* resets the noexec on that's set below */
DECLARE @startTime DATETIME2(0) = SYSDATETIME()
DECLARE @max_duration INT = 5 /* minutes */

/* <-- INSERT THE FULL schema.objectname */
DECLARE @objectId BIGINT = (SELECT OBJECT_ID(''))

/* Assuming the same DB context, but feel free to modify */
DECLARE @dbId INT = (SELECT DB_ID())

RAISERROR ('Starting the wait loop', 0, 1) WITH NOWAIT
WHILE DATEADD(MINUTE, @max_duration, @startTime) > SYSDATETIME()
BEGIN

    IF NOT EXISTS (  /* Check if an incompatible lock exists */
      SELECT TOP 1
        1
      FROM sys.dm_tran_locks AS dtl
      WHERE
          dtl.resource_database_id = @dbId
          AND dtl.resource_associated_entity_id = @objectId
          AND dtl.request_status = N'GRANT'
          AND dtl.request_mode = 'SCH-S'
    )
    BEGIN
        GOTO gotta_go_fast
    END

    WAITFOR DELAY '00:00:00.200' /* Format: hh:mm:ss.mss */
END

/* after the timeloop finishes without success, prevent execution of further code*/
    RAISERROR ('Time loop finished without success', 0, 1) WITH NOWAIT
    SET NOEXEC ON

gotta_go_fast:

/*
###############################################################################
  Insert the deploy code in the batch block below (bordered by GO separators) 
###############################################################################
*/
GO

GO
/*
###############################################################################
  The end of the deploy batch block
###############################################################################
*/

Explanation

Warning Test lock-checking performance before running it in a loop. Also, ensure the actual execution plan is disabled.
  • I start with some safety measures in a form of deadlock priority and lock timeouts
  • Variable values are in the subquery format so you can highlight and check the values before running
  • Inside the time loop I have the SCH-S lock check
    • The TOP 1 1 is not necessary for the EXISTS but in case you want to test the subquery with values instead of variables
  • If no lock is detected, jump to the gotta_go_fast: label which is right before the batch separator GO
    • You can test around which WAITFOR delay works best for you. The goal is to limit the number of loops so it's not too resource intensive
  • If you cannot find a window of opportunity within the @max_duration minutes we'll set NOEXEC ON to prevent any further code from running
    • SET NOEXEC OFF is at the start of the code to reset to the original state
  • Do not forget the trailing GO to properly terminate the batch
Note The lock check and the schema change are not atomic - a session can acquire SCH-S in the microseconds between the check passing and your ALTER acquiring Sch-M. SET LOCK_TIMEOUT 2000 means the deploy fails fast rather than blocking indefinitely, but it does not eliminate the window. It reduces it.

Additional help

The lock partitioning and schedulers play an important role when it comes to acquiring locks as mentioned in my other blog post.
It helps if we run our deploy at low priority script on the highest available scheduler. That way no other process can "jump the queue" and block our process.

To check current and max scheduler we can use this snippet

SELECT
    dot.scheduler_id as currentSchedulerId
    , dosi.scheduler_count as totalSchedulerCount
    , dot.session_id
FROM sys.dm_os_tasks AS dot
CROSS JOIN sys.dm_os_sys_info as dosi
WHERE dot.session_id = @@spid

If we get a low scheduler, we can just try to disconnect/reconnect to get a higher one. You can keep two sessions open and "reroll" the lower one until you get there.

A more aggressive version of the script

If some blocking is acceptable, replace the no-lock check with a lock threshold check and change the lock timeout to something longer. Unlike ABORT_AFTER_WAIT = BLOCKERS, neither version kills any sessions - the "aggressive" here means proceeding when the active lock count is low enough, not force-evicting blockers.

SET LOCK_TIMEOUT 8000
…
IF /* test for a lock count */
(  
    SELECT COUNT(1)
    FROM sys.dm_tran_locks AS dtl
    WHERE
        dtl.resource_database_id = @dbId
        AND dtl.resource_associated_entity_id = @objectId
        AND dtl.request_status = N'GRANT'
        AND dtl.request_mode = 'SCH-S'
) <= 5
…

Thank you for reading

Tom
Tom, TSQL Dev

SQL Server consultant from Czechia.

Give me a problem where the answer isn't obvious and the evidence doesn't add up. That's my idea of a good time.