All posts
Deep Dive · Blocking

Service Broker Blocking

SQL Server Service Broker poison message handling: how it disables a queue via CSbRollbackHandlerTask::DisableQ and why the blocked process report goes blind.

Tom · 4 min read
Service Broker Blocking

This post is dedicated to all 10 other DBAs that use Service Broker (you know who you are).

The main reason for this blog post is that I've got no hits on Google for CSbRollbackHandlerTask::DisableQ.
To improve my SEO I'll be the first to write about it (repeatedly) and get that sweet sweet Service-Broker-questions traffic.

Doc Ock palm of my hand meme where Sun is replaced with SEO

The problem

I was paged about a blocking chain where the blocked resource was a Service Broker queue, the lead blocked transaction was called CSbRollbackHandlerTask::DisableQ, the lock mode was SCH-M and even lock partitioning was involved. SCH-M is incompatible with every other lock mode, so any session trying to read from or activate the queue has to wait. I won't repro fully this time, but I've covered lock partitioning repro in my post on async stats update blocking. The goal was to capture an instance of transaction CSbRollbackHandlerTask::DisableQ with the same lock mode.

Repro

Let's create some Service Broker objects

CREATE DATABASE ServiceBrokerDemo

USE ServiceBrokerDemo
GO

-- Create the message type
CREATE MESSAGE TYPE SB_MessageType
VALIDATION = WELL_FORMED_XML
GO

-- Create the contract
CREATE CONTRACT SB_Contract
(
    SB_MessageType
    SENT BY INITIATOR
)
GO

-- Create the target queue and service
CREATE QUEUE SB_TargetQueue
GO

CREATE SERVICE SB_TargetService
ON QUEUE SB_TargetQueue
(SB_Contract)
GO

We'll need to get the object_id of the SB_TargetQueue to plug into the monitoring

SELECT OBJECT_ID('SB_TargetQueue')

The value 901578250 in the XE session below is from my instance - yours will differ, so run this first and swap in your result.

To monitor, I'll create an Extended Event session CSbRollbackHandlerTask

CREATE EVENT SESSION CSbRollbackHandlerTask ON SERVER 
ADD EVENT sqlserver.broker_queue_disabled
(
    SET collect_database_name = 1
    ACTION
    (
        sqlserver.is_system
    )
)
, ADD EVENT sqlserver.lock_acquired
(
    SET
        collect_database_name = 1
        , collect_resource_description = 1
    ACTION
    (
        sqlserver.is_system
    )
    WHERE 
        [object_id] = 901578250 /* insert Queue's object_id here */
        AND [mode] = 'SCH_M'
), ADD EVENT sqlserver.sql_transaction
(
    ACTION
    (
        sqlserver.is_system
    )
    WHERE
        [object_name] LIKE N'CSb%'
)
GO
ALTER EVENT SESSION CSbRollbackHandlerTask ON SERVER STATE = START

Note: XE uses SCH_M (with an underscore) in the mode predicate; in prose and DMV output you'll see it written as SCH-M.

Test manual disable

Can it be as simple as disabling the queue?

ALTER QUEUE SB_TargetQueue WITH STATUS = OFF

Extended Events grid after manually disabling the queue: 16 lock_acquired events in SCH_M mode, all is_system False, and no CSbRollbackHandlerTask transaction

The SCH-M lock is there (as expected, I'm disabling the queue) but no luck with the CSbRollbackHandlerTask::DisableQ transaction.

The difference: manual ALTER QUEUE ... STATUS = OFF is a user DDL transaction, so is_system comes back False. Poison message handling fires under an internal system task - and that distinction matters, as we'll see next.

Test poison message

The only other way to automatically disable a queue (I know of) is via poison message handling.
After I fail to process a message 5 times it gets marked as poisoned and the queue gets disabled (a safeguard so a failing message doesn't block the queue forever).

So I'll open up two sessions - one to send a message and another one where I'll repeatedly fail to process it.

DECLARE @conversation_handle uniqueidentifier
DECLARE @message_body xml

BEGIN DIALOG @conversation_handle
FROM SERVICE SB_TargetService
TO SERVICE N'SB_TargetService'
ON CONTRACT SB_Contract
WITH ENCRYPTION = OFF

SELECT @message_body = N'<SEO>CSbRollbackHandlerTask::DisableQ</SEO>'

SEND ON CONVERSATION @conversation_handle
MESSAGE TYPE SB_MessageType
(@message_body)

And in another session receive it in an infinite loop where I'll rollback each time and exit the loop on no messages processed. Don't forget to reenable the queue after Test 1 ALTER QUEUE SB_TargetQueue WITH STATUS = ON.

DECLARE @conversation_handle uniqueidentifier
DECLARE @message_body xml

WHILE 1 = 1
BEGIN
    BEGIN TRAN

    WAITFOR
    (
        RECEIVE TOP(1)
            @conversation_handle = conversation_handle
            , @message_body = CAST(message_body AS XML)
        FROM SB_TargetQueue
    )

    IF (@@ROWCOUNT = 0)
    BEGIN
        ROLLBACK
        BREAK /* break out of the loop on error */
    END
    
    ROLLBACK /*5 rollbacks in a row trigger poison message handling */
END

After five runs, I get an error message

Msg 9617, Level 16, State 1, Line 13
The service queue "SB_TargetQueue" is currently disabled.

Looking at the XE session

Extended Events grid after poison message handling fires: a sql_transaction named CSbRollbackHandlerTask::DisableQ and a broker_queue_disabled event, both with is_system True

I get both sql_transaction with the correct name - CSbRollbackHandlerTask::DisableQ and a broker_queue_disabled event. Unlike the manual disable, both entries have is_system = True - this is an internal system task, not a user transaction.

This is also why the blocked process report is useless here: when it detects a system process on either end, it stops giving useful details. The sad part is that it considers an Activation Stored Proc a background process and refuses to give me that information either. This and everything else is why debugging Service Broker issues isn't fun.


CSbRollbackHandlerTask::DisableQ

Thank you for reading

Tom
Tom, TSQL Dev

SQL Server consultant from Czechia.

Give me a problem where the answer isn't obvious and the evidence doesn't add up. That's my idea of a good time.