This article intends to give some beneficial suggestions that help to write a more readable T-SQL query.
Introduction
The major expectation from a code is that it works properly and generates the expected outputs. However, this expectation is not a sufficient criterion to say that this code is good because modern applications are now developed by teams that include many developers. Therefore, a developer can not hide his codes from the other developers forever because a code may require reading, review, editing, or maintenance by the other developers. At this point, the understanding of a code strictly depends on its readability.
Writing a readable T-SQL query
As abovementioned, a T-SQL query will be also re-read by many other database developers and it will be required editing. When a query is not readable, the database developer will spend more time reading the query than changing the query. The worst scenario is, they might misunderstand the code and change it in ways it was not meant to be changed. Mainly the following rules help to increase the readability of a T-SQL query:
- Formating
- Using comments
- Apply the DRY (Don’t Repeat Yourself) principle
Formating a T-SQL query
Query formatting can be defined as transforming the query codes into a well-organized and easily readable format. A well-formatted query is effortlessly read and figure out by the developers so that it helps to ease the following works:
- Maintaining the application codes
- Debugging and bug fixing process
We can see the messiness of the query below and it’s clear that this situation causes the code very difficult to read. All descriptions, table names, and column names are in disorganized form.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
DECLAre @NewValue as varchar(100)declare @I AS INT=0 if OBJECT_ID( N'tempdb..#NewTempTable')is nOT NULL BEgin dROP TABLE #NewTempTable END CREATe table #NewTempTable(Id inT,Column1 INT) IF @NewValue<=0WHILE @I<=10 BEgiN INSErt inTO #NewTempTable valUEs( @NewValue,@NewValue*100)seT @I=@I+1 END IF @NewValue>0 begin select p.ProductID,p. ProductNumber,p.Name,s. CarrierTrackingNumber,h.AccountNumber,h. CreditCardApprovalCode,dbo.[ufnGetStock](p.ProductID)as Stock,case when AccountNumber likE'10%' THEN'Account New'ELSE'Account Old'END As 'AccountRename' ,concat(Substring( CarrierTrackingNumber ,1,4),Substring(p.Class,1,4))froM Sales.SalesOrderDetailEnlarged s Inner jOIN Production.Product p On s.ProductID=p.ProductID inNER JOin Sales. SalesOrderHeaderEnlarged h on h.SalesOrderID=s.SalesOrderID where s.OrderQty>2aND LEN( CreditCardApprovalCode)>10ORDer by conCAT(Substring(CarrierTrackingNumber,1, 4),Substring(p.Class,1,4)),ProductID desc end |
To make this code more readable, we can use the various online SQL formatter tools or can use 3rd party SQL formatted add-ins for SQL Server Management Studio (SSMS). Such as, we can use an Online SQL formatter or SSMS add-in of it.
For the Azure Data Studio, some extensions can help to format a T-SQL query. Poor SQL Formatter can be an option to re-format queries in Azure Data Studio.
Add sufficiently descriptive comments to a T-SQL query
The “Your code should be self-documenting” is the very famous catchword between the developers and it means rather than using comments, your code should legible enough to decrease the need for comments. Maybe this approach forces the developers to write more readable codes but each developer may have different coding behavior therefore adding comments will always increase the readability of the code.
Adding some descriptive comments at the beginning of a T-SQL query will also improve the readability of the query codes. This comment can shortly explain which records these query fetches, business logic, table, and schema descriptions. Besides, we can add the project management tool issue number. The main characteristic of a comment is that it can explain query details to everyone shortly and clearly. The following query fetches some rows from the Production table and also performs a calculation for each product.
1 2 3 4 5 6 7 8 9 |
SELECT Production.Product.Name, Production.Product. ProductNumber,(SElect SUM(OrderQty*UnitPrice) from Sales.SalesOrderDetail where Sales. SalesOrderDetail.ProductID= Production.Product.ProductID)AS SalesTotal from Production. Product |
Now, we re-format the query codes then add some descriptive comments for the query and column names. We create borders above and below of the query description comment and then we add the query description into it. Also, we add the column descriptions near the column names.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
/*************************************************************************************** Query Description: This query returns the main data of the product and calculates total order amount for each of them ****************************************************************************************/ SELECT P.Name --Name of the product. , P.ProductNumber --Unique product identification number. , ( SELECT SUM ( OrderQty --Quantity ordered per product. * UnitPrice --Selling price of a single product. ) FROM Sales.SalesOrderDetail SalesLine WHERE SalesLinE.ProductID = P.ProductID ) AS SalesTotal FROM Production.Product P |
Using column descriptions
After designing the schema of a database, it is another important point to document it. Sometimes this step can be skipped as it is deemed laborious and unnecessary by developers and database administrators. However, schema documentation is a good guideline for the next developers when they need to resolve some issues about the designed database. At the least, adding column descriptions helps to understand the usage purpose of that column in the table it belongs to.
“Add column descriptions of the tables which are created by you if you want to leave a treasure for the next developers.”
Another benefit of column descriptions is that we can add them as comments to queries to identify the column names. At first, we glance how to create column descriptions. We can create column descriptions very easily using SSMS. To do this, we find the table for which we will create the column descriptions in Object Explorer. Select a table using a right-click and choose Design.
As a next step, we select the column to which we want to add a description. All properties of the selected column will be activated in the Column Properties tab. In this tab, we find the description field and add a short and descriptive explanation for that column.
After all that, we click the Save button or press the Ctrl + S key combination. To see this column’s descriptions we can use the following query.
1 2 3 4 5 6 7 8 9 10 11 12 |
select st.name [Table], sc.name [Column], sep.value [Description] from sys.tables st inner join sys.columns sc on st.object_id = sc.object_id left join sys.extended_properties sep on st.object_id = sep.major_id and sc.column_id = sep.minor_id and sep.name = 'MS_Description' WHERE st.name ='CountryListCensus' |
Extended Properties is a feature in SQL Server to store more information about database objects. We can use this feature to add more descriptions about the tables. To add an extended property for a table, we right-click on the table and then choose Properties. On the table properties window, we click the Extended Properties and add the description of a table. At first, we give a name and then add the value of it.
We can obtain the extended event properties to use the following query.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
SELECT ep.* FROM sys.extended_properties ep LEFT JOIN sys.objects AS o ON o.object_id = ep.major_id LEFT JOIN sys.schemas AS sch ON sch.schema_id = o.schema_id LEFT JOIN sys.columns AS clmns ON clmns.column_id = ep.minor_id AND clmns.object_id = ep.major_id WHERE major_id = object_id('CountryListCensus') AND clmns.object_id IS NULL |
Use alias to enhance the query readability
We can assign temporary names to tables or columns during the execution of a query and it improves the query readability. An alias will be very useful when a table name is very long and incomprehensible because we can easily convert them more legible. Such as, if we are working on an SAP database, the table names are not easily understood. The following query fetches some material stock movements.
1 2 3 4 5 6 7 8 9 10 11 |
declare @datefrom as date declare @dateto as date select MKPF. MANDT,MKPF.MBLNR,MKPF.MJAHR,SAP.MARA.MARA,SAP.MARA.MATKL,SAP. mseg.MENGE,SAP.mseg.BWART,SAP.mseg .LGORT from SAP.mseg JOIN SAP.MKPF ON MKPF. mandt=MSEG.mandt AND MKPF. MJAHR=MSEG.MJAHR and MKPF.MBLNR=MSEG.MBLNR join SAP.MARA on MARA. MANDT=MSEG.MANDT and MARA.MATNR=MSEG.MATNR wHERE MKPF.CPUDT BETWEEN @datefrom AND @dateto |
As we can see, the query is not easily understood when it is read because :
- Does not include any comment about the query, tables, variables, and columns
- The query is not well-formatted.
- The table names are directly used
Now we will fix these shortcomings. Firstly, we need to add shortly descriptive comments about the query and other objects. After formating the query and assigning the aliases to the table name the query will become more readable.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
/*************************************************************************************** Query Description: This query aims to retrieve stock movements between a certain date. ****************************************************************************************/ DECLARE @DateFrom AS DATE DECLARE @DateTo AS DATE SELECT MatHeader.MBLNR ----Number of material document , MatHeader.MJAHR ----Material document Year , MatData.MATNR ----Material number , MatData.MATKL ----Material oldNumber , MatMovement.MENGE ----Movement amount , MatMovement.BWART ----Movement type , MatMovement.LGORT ----Movement warehouse number FROM SAP.MSEG MatMovement--- MSEG stores the material movement details JOIN SAP.MKPF MatHeader ----MKPF is an standart Material Managment which stores header data of the material movements ON MatHeader.mandt = MatMovement.mandt AND MatHeader.MJAHR = MSEG.MJAHR AND MatHeader.MBLNR = MSEG.MBLNR JOIN SAP.MARA MatData ----MARA stores general material data ON MARA.MANDT = MSEG.MANDT AND MARA.MATNR = MSEG.MATNR WHERE MKPF.CPUDT BETWEEN @DateFrom AND @DateTo |
DRY (Don’t Repeat Yourself) principle
The main goal of the DRY principle is to reduce the repetition of writing the same code snippets. We can use the user-defined functions to implement this principle into a T-SQL query because user-defined functions allow us to encapsulate queries and invoke them from anywhere. Mainly, there are 3 different user-defined function types:
The Scalar valued function returns a single value.
The Multi-Statement Table-Valued function returns a table as output and this output table structure can be defined by the user. MSTVFs can contain only one statement or more than one statement.
The Inline Table-Valued function is a user-defined function that returns a table data type and also it can accept parameters.
The following query returns the Production table product name, product number, and sales total of each product.
1 2 3 4 5 6 |
SELECT Name,ProductNumber, (SELECT ISNULL(SUM(LineTotal),0) FROM Sales.SalesOrderDetail Sales WHERE Sales.ProductID =P.ProductID) AS SalesTotal FROM Production.Product P |
Now, we will transform this query into a more readable one. At first, we create a scalar-valued function instead of the subquery.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
CREATE FUNCTION Sales.CalculateProductSales (@ProductId INTEGER) RETURNS FLOAT /*************************************************************************************** Function Description: This function calculates the total sales for each product ****************************************************************************************/ AS BEGIN DECLARE @Total FLOAT SELECT @Total = ISNULL(SUM(LineTotal), 0) FROM Sales.SalesOrderDetail WHERE ProductID = @ProductId RETURN @Total END |
As the next step, we will implement this scalar-valued function into the query and we also add the required comments into the query.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
/*************************************************************************************** Query Description: This query aims to fetch each product sales amount ****************************************************************************************/ SELECT P.Name , --Name of the product P.ProductNumber , --Unique product identification number. Sales.CalculateProductSales(P.ProductID) AS SalesTotal FROM Production.Product P; --Products sold or used in the manufacturing of sold products. |
Don’t use the “SELECT * “ statement
A “SELECT *” statement returns all columns of the queried table(s) and it has very easy usage. However, we have to specify the column names explicitly in our queries. The reason for this rule is to clarify which columns are needed by a query so it will help with the readability of the query. Certainly, there are also other benefits to avoid ‘SELECT *’ usage:
- Prevent the unnecessary I/O operations
- Increase the network traffic
Summary
In this article, we learned some useful methodologies that help to improve the readability of a T-SQL query.
- SQL Performance Tuning tips for newbies - April 15, 2024
- SQL Unit Testing reference guide for beginners - August 11, 2023
- SQL Cheat Sheet for Newbies - February 21, 2023