This article explores the SQL Server TOP clause using various examples, along the way, we will also run through performance optimization while using the TOP clause in SQL Server.
Introduction
Sometimes we want to retrieve a specific number of records from the SQL table rather than retrieving all table records. We can use the WHERE clause in a SELECT statement to limit the number of rows. Suppose you have a customer table and you want to retrieve records for customers belonging to a particular country. There might be many records satisfying the condition. We require top ‘N’ customer records to satisfy the conditions. For this requirement, we can use TOP or ROW_Number() clauses.
Let’s go ahead and explore the TOP clause and its performance aspects.
SQL Server TOP clause syntax
SELECT TOP Expression | Percentage [Column_Names] [ WITH TIES ]
FROM [Table_Name]
- Expression: We can specify a numeric expression that defines the number of rows to be returned
- PERCENT: Here, we can use PERCENT value to return the percentage number of rows from total rows in the output
- WITH TIES: We can specify the WITH TIES clause to return more rows with values that match the last row in the result set. For example, we want the top 10 customers based on the purchase amount. If we do not use WITH TIES, SQL Server returns exactly 10 records while we might have more customers with a similar purchase cost. In this case, we can use WITH TIES to return more rows. We must use WITH TIES along with the ORDER BY Clause
Let’s look at a few examples of the SQL Server TOP clause for [HumanResources.Employee] table in the AdventureWorks sample database.
Example 1: SQL Server TOP Clause with a constant value
In this example, we retrieve the top 10 records from a SQL table without specifying any condition and ORDER BY Clause.
1 2 3 |
SELECT TOP (10) * FROM HumanResources.Employee; GO |
Let’s filter the output and retrieve only relevant columns. We can specify the required column names in the select statement.
1 2 3 4 5 6 |
SELECT TOP (10) NationalIDNumber, JobTitle, BirthDate, HireDate FROM HumanResources.Employee; GO |
In the next query, we specify an ORDER BY Clause to sort results in descending order of birthdate column.
1 2 3 4 5 6 7 |
SELECT TOP (10) NationalIDNumber, JobTitle, BirthDate, HireDate FROM HumanResources.Employee ORDER BY BirthDate DESC; GO |
If we do not specify the DESC clause, it sorts result in ascending order, and the top statement shows the required data.
1 2 3 4 5 6 7 |
SELECT TOP (10) NationalIDNumber, JobTitle, BirthDate, HireDate FROM HumanResources.Employee ORDER BY BirthDate; GO |
Example 2: TOP Clause with a PERCENT value
We can use PERCENT in a TOP statement to retrieve N PERCENT of rows in the output. The following query returns 2 percent of rows in the output.
1 2 3 4 5 6 7 |
SELECT TOP (2) PERCENT NationalIDNumber, JobTitle, BirthDate, HireDate FROM HumanResources.Employee ORDER BY BirthDate; GO |
We have a total of 290 records in the [HumanResources.Employee].
1 |
SELECT COUNT(*) AS TotalRecords FROM HumanResources.Employee; |
Two percent of 290 is 5.8 that is a fractional value, so SQL Server rounds output to six rows (next whole number).
Example 3: SQL Server TOP Clause and WITH TIES clause
In the following query, we retrieve records from the SQL table order by [SickLeaveHours] column.
1 2 3 |
SELECT TOP (15) * FROM HumanResources.Employee ORDER BY SickLeaveHours; GO |
We get 15 rows in the output, as shown below.
Let’s execute the previous query by adding the WITH TIES clause.
1 2 3 |
SELECT TOP (15) WITH TIES * FROM HumanResources.Employee ORDER BY SickLeaveHours; GO |
We get 18 records in the output despite adding the TOP 15 clause. In the output, we have multiple records for SickLeaveHours = 22, therefore WITH TIES clause adds all records have the same values and we get 18 records now in the output.
Example 4: TOP Clause and SQL Delete statement
We can use TOP Clause in a SQL delete statement as well. We should use the ORDER BY clause in the delete statement to avoid unnecessary deletion of data.
1 2 3 4 5 6 |
DELETE FROM [dbo].[SalesData] WHERE CustomerId IN (SELECT TOP 10 CustomerId FROM [dbo].[SalesData] ORDER BY OrderDate ASC); GO |
In the above query, we want to retrieve the top 10 customers’ records in [orderdate] ascending order. Execute the query, and it deleted 13 records. We specified the TOP 10 clause, so it should not delete more than 10 records.
It might delete more records if the column specified in the TOP Clause contains a duplicate value. Therefore, we should use a delete statement with TOP carefully and use it with the primary key column only.
Example 5: TOP Clause and SQL Insert statement
The following query inserts the top 10 records from a select statement into [TempInsert] table. First, let’s create a SQL table using the CREATE TABLE statement with relevant data types.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
CREATE TABLE TempInsert ([FirstName] VARCHAR(50), [LastName] VARCHAR(50), [JobTitle] VARCHAR(150) ); INSERT INTO TempInsert OUTPUT inserted.[FirstName], inserted.[LastName], inserted.[JobTitle] SELECT TOP 10 [FirstName], [LastName], [JobTitle] FROM [AdventureWorks].[HumanResources].[vEmployee] ORDER BY BusinessEntityID DESC; |
We use the OUTPUT clause to display the records inserted in the TempInsert table. Executing the query gives the following result. We can use TOP Clause in insert statement as well, but it is recommended to use it in the Select statement. If we use it in the INSERT clause, it might cause performance issues, especially for a large table.
Example 6: SQL Server TOP Clause and SQL UPDATE statement
We can use TOP Clause in a SQL Update statement as well to restrict the number of rows for an update. Essentially, it is a combination of the select statement and update. In the following query, we use the TOP clause in the select statement and update the values in the [HumanResources.Employee] table.
1 2 3 4 5 6 7 8 9 10 11 |
UPDATE HumanResources.Employee SET SickLeaveHours = SickLeaveHours - 10 FROM ( SELECT TOP 10 BusinessEntityID FROM HumanResources.Employee ORDER BY HireDate ASC ) AS th WHERE HumanResources.Employee.BusinessEntityID = th.BusinessEntityID; GO |
Example 7: The TOP clause in a variable of a select statement
We can use a variable to define the number of records in the TOP statement. In the following query, we defined an int variable @i and specified value 10 for it. Later, we use this variable in the TOP clause.
1 2 3 4 5 6 7 |
DECLARE @i AS INT= 10; SELECT TOP (@i) [FirstName], [LastName], [JobTitle] FROM [AdventureWorks].[HumanResources].[vEmployee] ORDER BY BusinessEntityID DESC; GO |
Performance optimization for SQL Server TOP clause
For this section, let’s generate some test data and use the TOP clause in the select statement.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
CREATE TABLE TestTable (ID INT PRIMARY KEY CLUSTERED , ZIP INT, Remarks CHAR(3000) ); GO DECLARE @i INT; SET @i = 1; WHILE @i < 60000 BEGIN INSERT INTO TestTable VALUES (@i, RAND() * 200000, 'Sample Data' ); SET @i = @i + 1; END; GO |
Now, clear the buffer cache, enable the actual execution plan in SSMS and execute the following query.
1 2 3 4 5 |
DBCC FREEPROCCACHE SELECT TOP 100 * FROM TestTable WHERE ID < 30000 ORDER BY ZIP; |
Now, let’s run another select statement after clearing the buffer cache and view the execution plan.
1 2 3 4 5 |
DBCC FREEPROCCACHE; SELECT TOP 1001 * FROM TestTable WHERE ID < 30000 ORDER BY ZIP; |
The execution plan is the same; however, we see a warning in the sort operator. Sort operator also took 1.205s in comparison to 0.112s of the previous execution.
Once we hover the mouse over the sort operator, it shows that the sort operator used tempdb to spill data. It read and write 1977 pages from the tempdb spill.
In many cases, this sort operator and TempDB spill might cause performance issues, and we should eliminate it using proper indexes. We should never ignore the TempDB spill in the query execution plan. In the above image, we can see it caused the TempDB spill due to the ZIP column. It is the same column for which we want to sort the data.
For this example, let’s create a Non-clustered index on the ZIP column
1 |
CREATE NONCLUSTERED INDEX INX_NC_TestTable ON TestTable(ZIP); |
Once we have created the index, update the statistics with FULLSCAN so that query optimizer can use this newly created index.
1 |
UPDATE STATISTICS TestTable WITH FULLSCAN; |
Let’s rerun the select statement that caused the TempDB spill.
1 2 3 4 5 |
DBCC FREEPROCCACHE; SELECT TOP 1001 * FROM TestTable WHERE ID < 30000 ORDER BY ZIP; |
In the execution plan, we do not have a sort operator, and it uses a NonClustered scan along with clustered key lookup. It also eliminated the TempDB spill.
Conclusion
In this article, we get a glimpse of the SQL Server TOP clause using various examples. We also looked at the costly sort operator and its performance implications. You should always look at the workload, tune the query and create proper indexes.
- Understanding PostgreSQL SUBSTRING function - September 21, 2024
- How to install PostgreSQL on Ubuntu - July 13, 2023
- How to use the CROSSTAB function in PostgreSQL - February 17, 2023