Coding Standards (T-SQL Tuesday #151)
This month’s invitation is from Mala Mahadevan, and it’s about coding standards. Fortunately, I may forget several basic ones because my team probably eradicated them long ago.
I’ll go with a rapid-fire strategy, just a long list of things to do and avoid with only a brief explanation. Some of these I’ve already covered in my rant.
Don’t do this 🔗︎
I’ll start with the don’t because the risk of doing something terrible usually outweighs doing something really well.
No scalar functions 🔗︎
Lots of people underestimate just how lousy scalar functions are.
- You force the SQL to process row by agonizing row (RBAR).
- It’s invisible to
SET STATISTICS IO, TIME ON, so you might not even know how bad it is.
- It prevents parallelism - even if the scalar function is in a view’s column, you are not referencing.
- It can be hidden in Computed Columns or Check Constraints.
Even though the topic is already covered by many, I might blog about it later.
No Merge 🔗︎
I’ll leave Aaron Bertrand and co to explain this.
No magic constants 🔗︎
WHERE p.ProductType <> 4
4? Just set a variable (constant) from a lookup table. Or write a comment with an explanation.
It’s the least you can do.
No meaningless wrappers 🔗︎
I mean brackets and parenthesis
WHERE ( ([p].[ProductType] <> 4) AND ([p].[CreateDate] >= '2022-06-12 00:00:00.000') )
could just be this
WHERE p.ProductType <> 4 AND p.CreateDate >= '2022-06-12 00:00:00.000'
There is no need for this extra eye strain. I only use brackets when the identifier turns blue.
Parenthesis should be added to help readability. Also, when the order of operations might not be clear, like when
OR is combined.
No business logic/validation in Triggers 🔗︎
Stay away from the Triggers. I have only two use cases for them:
- Scaffolding during refactoring
- Sync two copies of the same table
INSTEAD OFTrigger on a view)
- Enforcing audit columns (LastModificationDate, ModifiedBy)
- Even though now we can use Temporal tables
No capes! 🔗︎
Sorry, that’s just my favourite Incredibles quote. Not sure how that got here.
No ordinals 🔗︎
Apart from a quick ad-hoc query, never use this
GROUP BY 2, 4 ORDER BY 1, 3 DESC, 2
It’s very brittle. Someone will come and change the select order and screw up the logic.
It’s terrible for readability.
No old JOIN syntax 🔗︎
Even though I cannot remember the last time I saw it, don’t use the ANSI-89 JOIN syntax.
SELECT * FROM dbo.Customer AS c, dbo.OrderHeader AS oh WHERE oh.CustomerId = c.CustomerId
There is just no need to use this.
If you forget the join condition in the
WHERE clause, you have a Cartesian product (row count from
c × row count from
Always* do this 🔗︎
On the other hand, I usually recommend these (asterisk applies).
Alias everything 🔗︎
Especially table names. I prefer to use initials.
Nullability is specified 🔗︎
When declaring table variables or temp tables, always specify if the column can be
NULL. It helps to set the expectations about the data.
All constraints are named 🔗︎
Some schema compare tools are smart enough to disregard those. But if you script out two objects and compare them with a diff tool, you will find many differences if the constraints are not explicitly named.
Scripts should be idempotent 🔗︎
I had to search the term definition the first time around. It means that the script can be run many times but do that thing you wanted only once.
Unless, of course, you rely on the script to break if something is amiss.
Only regular identifiers 🔗︎
I’m not too fond of brackets, so don’t force me to use them.
You don’t have to type out brackets when using only regular identifiers.
As few hints as possible 🔗︎
They say: “Change is the only constant”, but it’s not true. So are the hints. Once they’re used, they usually stay. It’s a slippery slope telling SQL how it should behave - unless you are Paul White.
XACT_ABORT everywhere 🔗︎
SQL Server is not very consistent in handling transactions, so it needs all the help it can get. I’ll let Michael J Swart cover the rest.
Always be deterministic 🔗︎
If you ever wondered why you get the same result on two different pages, it’s probably because of determinism. So I usually add a surrogate PK to the
ORDER BY clause when I want to make sure to be deterministic.
The TOP clause has parentheses 🔗︎
A little-known fact is that the TOP operator accepts an expression. But it needs to be wrapped in parenthesis like so
[ TOP (expression) [PERCENT] [ WITH TIES ] ]
Everything is schema-qualified 🔗︎
Apart from the default schema issues, it helps differentiate real tables from table expressions.
But it can even lead to Compile locks blocking
One condition per line 🔗︎
It makes it easier to read and move conditions around.
Do you agree with my standards? Let me know and thank you for reading.