In this article, we will learn the basics of Parallel Execution Plans, and we will also figure out how the query optimizer decides to generate a parallel query plan for the queries.
Let’s first look at how a query is executed and the role of the query optimizer in this process. When a query is executed, it proceeds through the following steps.
Query Parsing -> Binding (Algebrizer) -> Query Optimization -> Execution
The query optimizer makes an effort to produce an optimum query plan on the query optimization step. This step is very crucial because the output of the query optimizer directly affects the query performance. Query optimizer analyzes various query plan candidates and chooses the good enough plan from these candidates. Query optimizer makes a decision based on the balance between optimization time and plan quality when deciding on a query plan. Therefore, query optimizer does not work to find the best query plan; otherwise, this process might cause to consume more time, and the execution time will take longer.
When creating a query plan, the query optimizer is affected by various parameters. For example, the following parameters affect the generated query plans.
- Database compatibility level
- Cardinality Estimation version
- Query hints
In this context, it is clear that the goal of the query optimizer is to produce efficient execution plans that help to reduce the execution time of the queries. In accordance with this purpose, parallel query processing might be a good option to reduce the response time of the query by using the multiple CPU’s power.
What is parallel processing?
The parallel processing aims to separate big tasks into more than one small task, and these small tasks will be completed by the discrete threads. In this approach, more than one task will be performed in unit time; thus, the response time will be reduced dramatically. This idea does not change for SQL Server; it tries to process queries that require excessive workload in a parallel manner. The query optimizer in SQL Server takes into account three settings when generating a parallel query plan. These are:
- Cost Threshold for Parallelism
- Max Degree of Parallelism (MAXDOP)
- Affinity mask
Pre-requisites
In this article, we will use the AdventureWorks database, and we will also use the Create Enlarged AdventureWorks script to generate an enlarged copy of the SalesOrderHeader and SalesOrderDetail tables.
As a second step, we will create a non-clustered index through the following query.
1 2 |
CREATE NONCLUSTERED INDEX IX_L001 ON [Sales].[SalesOrderHeaderEnlarged] ([PurchaseOrderNumber]) |
Lastly, we will switch compatibly level to 150 (SQL Server 2019) so that our test database will be ready for all examples of this article.
1 |
ALTER DATABASE AdventureWorks2008R2 SET COMPATIBILITY_LEVEL = 150 |
Cost Threshold for Parallelism
The estimated query cost is a unit that is calculated using the I/O and CPU requirement of a query. This measurement helps the optimizer to evaluate the cost of the query plans and select the optimal plan. Query optimizer calculates the estimated cost of the query by summing the estimated cost of individual operators in the query plan. The Estimated Subtree Cost attribute indicates the estimated cost of the plan in the query plan.
When this value exceeds the Cost Threshold for Parallelism setting, the query optimizer begins to consider creating a parallel query plan alongside the serial plans. The default value of this setting is 5, and it can be changed using SQL Server Management Studio or Transact-SQL.
With the help of the following query, we can set the cost threshold for parallelism value as 20.
1 2 3 4 5 6 7 8 |
EXEC sp_configure 'show advanced options', 1 ; GO RECONFIGURE GO EXEC sp_configure 'cost threshold for parallelism', 20 ; GO RECONFIGURE GO |
Now we will execute the following query and analyze the actual execution plan.
1 2 3 4 5 6 7 |
SELECT ProductID,SUM(LineTotal) AS TotalsOfLine , SUM(UnitPrice) AS TotalsOfPrice, SUM(UnitPriceDiscount) AS TotalsOfDiscount FROM Sales.SalesOrderDetailEnlarged SOrderDet INNER JOIN Sales.SalesOrderHeaderEnlarged SalesOr ON SOrderDet.SalesOrderID = SalesOr.SalesOrderID WHERE PurchaseOrderNumber LIKE 'PO%' GROUP BY ProductID |
As we can see, the estimated subtree cost is nearly 37.12, and it is greater than the cost threshold for parallelism value so that the query optimizer generates a parallel query plan. The query optimizer considers generating a parallel query plan when the estimated subtree cost exceeds the cost threshold for parallelism value. Otherwise, the query optimizer will only evaluate the serial plans and will decide one of them. The Degree of Parallelism attribute indicates the number of processors that have been used by the query during the execution period.
However, the ENABLE_PARALLEL_PLAN_PREFERENCE query hint forces the query optimizer to generate a parallel query plan without thinking about the cost threshold for parallelism. For example, the following query estimated subtree cost is 17.12, and the query optimizer should decide on a serial query plan.
1 2 3 4 5 6 |
SELECT DISTINCT ProductID FROM Sales.SalesOrderDetailEnlarged SOrderDet INNER JOIN Sales.SalesOrderHeaderEnlarged SalesOr ON SOrderDet.SalesOrderID = SalesOr.SalesOrderID WHERE PurchaseOrderNumber LIKE 'PO%' GROUP BY ProductID |
If we add the ENABLE_PARALLEL_PLAN_PREFERENCE query hint at the end of the query, the query optimizer will decide to use a parallel execution plan.
1 2 3 4 5 6 7 |
SELECT DISTINCT ProductID FROM Sales.SalesOrderDetailEnlarged SOrderDet INNER JOIN Sales.SalesOrderHeaderEnlarged SalesOr ON SOrderDet.SalesOrderID = SalesOr.SalesOrderID WHERE PurchaseOrderNumber LIKE 'PO%' GROUP BY ProductID OPTION(USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')) |
Max Degree of Parallelism
Max Degree of Parallelism, also known as MAXDOP, is a server, database, or query level option that determines the maximum number of logical processors that can be used when a query is executed. By default, this option is set to 0, and it means that the query engine can use all available processors. We can find out this value under the Advanced settings on the Server Properties page.
On the other hand, this value can be changed for a particular database and can be seen on the Options tab of the database.
The MAXDOP hint can be used to explicitly specify how many processors can be used by a query. For the below query, we will limit the maximum number of CPUs to be used to 2.
1 2 3 4 5 6 7 8 |
SELECT ProductID,SUM(LineTotal) AS TotalsOfLine , SUM(UnitPrice) AS TotalsOfPrice, SUM(UnitPriceDiscount) AS TotalsOfDiscount FROM Sales.SalesOrderDetailEnlarged SOrderDet INNER JOIN Sales.SalesOrderHeaderEnlarged SalesOr ON SOrderDet.SalesOrderID = SalesOr.SalesOrderID WHERE PurchaseOrderNumber LIKE 'PO%' GROUP BY ProductID OPTION (MAXDOP 2) |
In the above query plan, the degree of parallelism attribute shows how many processors have used when the above query was executed.
Also, this option can be used to force query optimizer to generate a serial query plan. In the following query, we will set the MAXDOP option as 1 and the query optimizer will create a serial query plan.
1 2 3 4 5 6 7 8 |
SELECT ProductID,SUM(LineTotal) AS TotalsOfLine , SUM(UnitPrice) AS TotalsOfPrice, SUM(UnitPriceDiscount) AS TotalsOfDiscount FROM Sales.SalesOrderDetailEnlarged SOrderDet INNER JOIN Sales.SalesOrderHeaderEnlarged SalesOr ON SOrderDet.SalesOrderID = SalesOr.SalesOrderID WHERE PurchaseOrderNumber LIKE 'PO%' GROUP BY ProductID OPTION (MAXDOP 1) |
As we can see, the NonParallelPlanReason attribute obviously shows why the query optimizer does not generate a parallel execution plan. For this example, the attribute value shows the MaxDOPSetToOne value because we have set the MAXDOP option as 1.
Affinity Mask
In SQL Server, this option helps to restrict specific CPU core usage. So that the SQL Server uses only the dedicated CPU cores. However, about this option, Microsoft warns us with the following note:
“This feature will be removed in a future version of Microsoft SQL Server. Do not use this feature in new development work, and modify applications that currently use this feature as soon as possible.”
To find this option, you can navigate to the Processors tab on the SQL Server properties page. In the following illustration, we will set the only one CPU core to SQL Server.
Now we will execute the following query and interpret the query plan.
1 2 3 4 5 6 7 |
SELECT ProductID,SUM(LineTotal) AS TotalsOfLine , SUM(UnitPrice) AS TotalsOfPrice, SUM(UnitPriceDiscount) AS TotalsOfDiscount FROM Sales.SalesOrderDetailEnlarged SOrderDet INNER JOIN Sales.SalesOrderHeaderEnlarged SalesOr ON SOrderDet.SalesOrderID = SalesOr.SalesOrderID WHERE PurchaseOrderNumber LIKE 'PO%' GROUP BY ProductID |
In this execution plan, we can see a new value for the NonParallelPlanReason attribute. EstimatedDOPIsOne specifies that only one CPU core is dedicated to the SQL Server; for this reason, it generates a serial query plan.
Tip: Inserting data into the variable does not allow us to generate parallel query plans. In the following query, we will declare a table variable and then insert some rows to it. Also, we will enable the actual query plan and will analyze it.
1 2 3 4 |
DECLARE @Temp AS TABLE (ID INT , TrackingNumber VARCHAR(100), LTotal FLOAT) INSERT INTO @Temp SELECT ProductID,CarrierTrackingNumber , LineTotal FROM Sales.SalesOrderDetailEnlarged WHERE CarrierTrackingNumber = '4911-403C-98' |
For this query, we figured out that the query optimizer has created a serial query plan because the NonParallelPlanReason attribute indicates the CouldNotGenerateValidParallelPlan value. On the other hand, temporary tables allow creating parallel query plans when we insert to rows. Now we will execute the same query for a temporary table and interpret the query plan.
1 2 3 4 5 6 |
DROP TABLE IF EXISTS #TempInsert CREATE TABLE #TempInsert (ID INT , TNumber VARCHAR(100), LTotal FLOAT) GO INSERT INTO #TempInsert SELECT ProductID,CarrierTrackingNumber , LineTotal FROM Sales.SalesOrderDetailEnlarged WHERE CarrierTrackingNumber = '4911-403C-98' |
As seen above, inserting data into the temporary table allows creating a parallel query plan. This is the main difference between temporary tables and table variables in terms of parallelism.
Conclusion
In this article, we learned the factors that help query optimizer to decide when to create Parallel Execution Plans. The query optimizer takes into account three options when generating parallel plans as we have stated them above:
- Cost Threshold for Parallelism
- Max Degree of Parallelism (MAXDOP)
- Affinity mask
- SQL Performance Tuning tips for newbies - April 15, 2024
- SQL Unit Testing reference guide for beginners - August 11, 2023
- SQL Cheat Sheet for Newbies - February 21, 2023