The purpose of this article is to provide insights into how parameter sniffing occurs for an ad-hoc query and how it affects their performance.
Case Study: Untouchable legacy code
Users are beginning to complain about the performance of the e-mail send module of the purchase application. This small piece of code finds the number of rows in the email table by taking different parameters. However, this code starts to run poorly in terms of performance as the email table total rows number gets to very large numbers. After the performance of the application is analyzed by the software development team, they realize a legacy subroutine code affects the application’s performance. This routine is only executing an ad-hoc query and this query causes an extremely long waiting time for some parameter values. After that, they write a simulation code to identify the root problem.
Finally, they diagnose the problem, the application waits for an unacceptable amount of time when a certain value is passed as a parameter to the query. Firstly, the team tries to overcome this problem on the application code. Unfortunately, the team set back making any changes to the source code of the application because they cannot guess its impacts on the application because of the insufficient documentation and then decided to find a solution on the SQL Server side.
In the next sections of this article, we will try to identify the cause of the problem in this case study. Besides this, we will discuss the pros and cons of some alternative solutions. However, firstly, we’ll take a look at some basic concepts before we get into the details.
Pre-Requirements
We’ll create a sample table to use the next parts of this article and populate it with some synthetic data.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
CREATE TABLE EmailList( Id INT PRIMARY KEY IDENTITY(1,1) ,usermailname VARCHAR(100) , mailadress VARCHAR(100)) GO DECLARE @I AS INT =1 WHILE @I<= 5000000 BEGIN IF @I%2 = 0 OR @I%3 =0 OR @I%5 =0 OR @I%7=0 BEGIN INSERT INTO EmailList VALUES('defaultmail','defaultmail@mail.com') END ELSE BEGIN INSERT INTO EmailList VALUES(CONCAT(@I,'mail_adress'),CONCAT(@I,'mail_adress@mail.com')) END SET @I =@I+1 END CREATE INDEX IX_mailadress_001 ON EmailList (mailadress) |
What is an ad-hoc query?
Ad-hoc is a Latin origin word and it means “for this specific purpose” in English. An ad-hoc query is a single query that is not involved in any of the pre-defined queries (stored procedure, functions, view, etc.) and it is also non- parameterized queries. For example, the following query is an ad-hoc query:
1 2 3 |
SELECT * FROM Foo WHERE Bar >10 |
How is an ad-hoc query stored in the SQL Server query plan cache?
SQL Server can cache the query plan of an ad-hoc query. However, for a query to reuse a cached execution plan, its syntax must be absolutely the same as the cached query. Such as, adding spaces or any comments will be perceived by the optimizer as a new query and will cause a new query plan to be created. The main disadvantage of this working principle is when a database takes intensive ad-hoc query requests the excessive query compilations may cause a heavy workload on the database engine. The following image illustrates different cached query plans for the same query because of the space character.
Simple and Forced query parameterization
SQL Server query parameterization setting offers simple and forced parameterization options. We can find this setting in the database properties.
In the simple parameterization mode, the query optimizer may decide to parametrize some simple queries. To do this, the explicitly defined values are parameterized so that the newly generated query plan can be reused for the different values. If an ad-hoc query contains the following expressions the query does not parametrize in simple mode.
- JOIN
- IN
- BULK INSERT
- UNION
- INTO
- DISTINCT
- TOP
- GROUP BY
- HAVING
- COMPUTE
- CROSS/ OUTER APPLY
- Sub Queries
SQL Server will parameterize this simple query and this case can be seen in the query plan cache.
1 2 3 |
SELECT mailadress FROM EmailList WHERE mailadress = '13mail_adress@mail.com' |
1 2 3 4 5 6 7 8 |
SELECT plan_handle,UseCounts,RefCounts, Cacheobjtype, Objtype, TEXT AS SQL FROM sys.dm_exec_cached_plans CROSS APPLY sys.dm_exec_sql_text(plan_handle) where text like '%EmailList%' and text not like '%plan_handle%' and objtype = 'Prepared' |
When we run the same query with a different value, the plan created for the previous query will be used.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
SELECT mailadress FROM EmailList WHERE mailadress = '23mail_adress@mail.com' </p> <p> <img style="margin: 0px auto; display: block;" src="https://www.sqlshack.com/wp-content/uploads/2022/06/execution-plan-parameter-list.png" class="wp-image-79367" alt="Execution plan Parameter List" /> </p> <p> Under the forced parameterization mode, SQL Server will try to parameterize every ad-hoc query disregarding their simplicity and other expressions. </p><h2>Identify the case study problem: Ad-hoc query bad parameter sniffing</h2> <p> As we stated before, SQL Server is caching and reusing the compiled query plans, so that with this working principle reduces the query execution times and saves memory and CPU resources. The drawback of this mechanism occurs when the cached query plan does not show an effective performance for certain parameter values. Generally, this problem is experienced in the stored procedures, but we also face this problem in ad-hoc queries. At first, we'll run a query that filters out an infrequent value in the table. The execution plan of this query will be parameterized and cached into the plan cache in this manner. </p> <pre lang="tsql"> SELECT count(Id) FROM EmailList WHERE mailadress = '13mail_adress@mail.com'</strong> |
1 2 3 4 5 6 7 8 |
SELECT plan_handle,UseCounts,RefCounts, Cacheobjtype, Objtype, TEXT AS SQL FROM sys.dm_exec_cached_plans CROSS APPLY sys.dm_exec_sql_text(plan_handle) where text like '%EmailList%' and text not like '%plan_handle%' and objtype = 'Prepared' |
Now, we’ll execute a query that filters out a high-density value in the table.
1 2 3 |
SELECT COUNT(Id) FROM EmailList WHERE mailadress = 'defaultmail@mail.com' |
Query optimizer decided to use the cached query plan for this high-density value, but this cached query plan does not show an effective performance. When we use the OPTION (RECOMPILE) hint in a query, the optimizer rebuilds the execution plan for every execution of the query. We add this hint to our query and analyze the recompiled execution plan.
1 2 3 4 |
SELECT COUNT(Id) FROM EmailList WHERE mailadress = 'defaultmail@mail.com' OPTION (RECOMPILE) |
As we can see, the optimizer totally generates a different query plan for the same query. First, it decided on a parallel query plan and it also uses a batch mode on rowstore feature. Shortly, the problem occurs because the cached query plan performance is suboptimal for this value.
Using sp_create_plan_guide
The sp_create_plan_guide allows us to add some query hints into the queries without explicitly interfering with the query syntax. OPTION (RECOMPILE) hint can be a solution to overcome the parameter sniffing problems so if we inject this hint into the query we can eliminate the problem for this case study. The syntax of the query will be converted into the following command by the System.Data.SqlClient class before sending the SQL Server.
1 2 3 |
exec sp_executesql N'SELECT COUNT(Id) FROM [EmailList] WHERE mailadress=@MailAdrr',N'@MailAdrr varchar(200)',@MailAdrr='13mail_adress@mail.com'</strong> |
We can add OPTION(RECOMPILE) hint to this query without changing any code of the application through the sp_create_plan_guide procedure as like the below.
1 2 3 4 5 6 7 8 9 |
EXEC sp_create_plan_guide @name = N'FixPurchaseAppAdHoc', @stmt = N'SELECT COUNT(Id) FROM [EmailList] WHERE mailadress=@MailAdrr', @type = N'SQL', @module_or_batch = NULL, @params = N'@MailAdrr varchar(200)', @hints = N'OPTION (RECOMPILE)' |
We can display the created plan guides with help of the following query.
1 2 3 |
SELECT * FROM sys.plan_guides |
After the creation of the plan guide, the queries which are sent from the purchase application mail module will be recompiled. This state of affairs can be monitored through the sql_statement_recompile event.
As we can see, the optimizer recompiles the queries without considering the value of all parameters so that we obtain fresh query plans for every execution of the query. As a result, with this change, the performance of the application fell below the limit value. To drop the created plan guide:
1 2 3 4 |
EXEC sp_control_plan_guide N'DROP', N'FixPurchaseAppAdHoc' GO |
Disabling Parameter Sniffing
SQL Server offers a Parameter Sniffing setting at the database level. When we disable this setting, the optimizer considers the average data distribution while compiling a query. This setting can be found under the Database Scoped Configurations.
After disabling the Parameter Sniffing option optimizer will calculate the estimated number of rows using the density vector rather than the statistics histogram.
1 2 3 |
exec sp_executesql N'SELECT COUNT(Id) FROM [EmailList] WHERE mailadress=@MailAdrr',N'@MailAdrr varchar(200)',@MailAdrr='defaultmail@mail.com' |
We can obtain the estimated number of rows to multiple the total number of rows with the density.
1 2 3 |
DBCC SHOW_STATISTICS(EmailList,'IX_mailadress_001') |
Tip: We can use Google for complicated mathematical calculations.
Disabling parameter sniffing is not an effective method to get rid of the bad parameter sniffing and it may cause performance problems on the other ad-hoc queries and stored procedures. However, in some special scenarios, it can resolve the parameter sniffing problems.
Conclusion
In this article, we discussed a case study about the ad-hoc query parameter sniffing problems. It is very difficult to resolve parameter sniffing issues without altering the query structures, but for this case, we were lucky and found a solution method.
- SQL Performance Tuning tips for newbies - April 15, 2024
- SQL Unit Testing reference guide for beginners - August 11, 2023
- SQL Cheat Sheet for Newbies - February 21, 2023