In this article, we will go through the different ways to set the Max Degree of Parallelism option. Before doing that, it is helpful to understand the concept of parallelism, how it works and what the MSXDOP is.
The main idea of the Parallelism is splitting the main task into smaller tasks, running simultaneously, with each task accomplishes part of the job, then assigning these tasks to a number of processors and finally combining the partial results into one final result. You can imagine the parallelism as running multiple serial plan tasks at the same time in a parallel plan, each task runs on a separate processing unit using its own server connection. In this way, a parallel will run much faster than the serial plan.
There are few technical terms that we should understand in order to be familiar with the parallelism. The first term is the Scheduler, which is the physical or logical processor that is responsible for scheduling execution of the SQL Server threads, taking into consideration that only one thread can be in the running state at any given time. The second term is the Worker, which represents the thread that is bounded to a specific schedule to accomplish a certain job. The fourth term is the Task, which represents a piece of work scheduled by SQL Server and assigned to a specific SQL Server worker. The Execution Context is where each task run inside, with each task runs inside only one execution context. The Parallel Page Supplier is a SQL Server Storage Engine component that is responsible for distributing the rows to be processed within the involved workers upon the workers’ demands and requests. The last component is the Exchange, that is responsible for connecting the different execution contexts of the parallel plan together.
Before executing a query, the SQL Server Database Engine checks if the parallel execution is required depending on many factors, such as, if the SQL Server instance is running on a server with more than one CPU, if the required number of threads is available and can be satisfied, the workload type, if the number of rows to be processed is sufficient to use parallel plan and if the query cost exceeds the Cost Threshold for Parallelism value. If it is required, the SQL Server Database Engine detects the number of threads and distribute the execution of the tasks among these threads. The number of workers that the SQL Server Database Engine assigns for each parallel plan execution to perform a specific task is called the Degree of Parallelism (DOP). Which also determines the number of processors that will be used to execute a parallel plan.
The number of processors that are used in the parallel plan execution can be limited using the Max Degree of parallelism option. Setting the Max Degree of Parallelism option to 0 will allow the SQL Server Engine to use all available CPUs on the current server up to 64 processors in the parallel plan execution. Only a Max Degree of Parallelism with a value equal to 1, serial plan execution will be used. You can override the Max Degree of Parallelism default value to specify the number of processors to be used in the parallel plan execution by setting it to any value between 2 and 32,767. If the value of the Max Degree of Parallelism is greater than the actual number of processors in the server, the actual number of processors that will be used in the parallel plan execution is all the available processors. If the SQL Server instance is hosted on a single processor server, the Max Degree of Parallelism value will be ignored.
There are many ways to set the Max Degree of Parallelism value. The first method is setting the value at the server level using SQL Server Management Studio. From the advanced page of the Server Properties, scroll down to the Parallelism section and override the default value of the Max Degree of Parallelism to the value that fits your workload as in the below:
The Max Degree of Parallelism value can also be overridden at the server level using the SP_Configure T-SQL command, after enabling the Show Advanced Option configuration. The below script is used to change the default Max Degree of Parallelism value and set it to 4:
1 2 3 4 5 6 7 8 9 10 |
EXEC sp_configure 'show advanced options', 1; GO RECONFIGURE WITH OVERRIDE; GO EXEC sp_configure 'max degree of parallelism', 4; GO RECONFIGURE WITH OVERRIDE; GO |
The third method that can be used to override the Max Degree of Parallelism value is using the OPTION (MAXDOP num) query hint, which will override the server Max Degree of Parallelism for that query with the value specified (num) in the query hint. The below query will not use parallel plan execution although the Max Degree of Parallelism value equal to 0 at the server level due to overriding the value to 1 at the query level suing OPTION (MAXDOP 1) hint:
1 2 3 4 5 6 |
USE SQLShackDemo GO SELECT TOP 10 * FROM Employees OPTION (MAXDOP 1) |
Right-click on the SELECT operator from the execution plan of the previous query and choose Properties, you will find that the Degree of Parallelism value is 0, which is the server’s level Max Degree of Parallelism value, but the parallel plan execution is not used due to MaxDopSetToOne reason:
The Max Degree of Parallelism can be overridden also at the database level in SQL Server 2014 SP2 and SQL Server 2016 versions from the newly defined Database Scoped Configurations. From the Advanced Options of the Database Properties window, the Max Degree of Parallelism value can be configured for a specific database by replacing the Max DOP value with the one that fits this database workload as in the below image:
Enabling the trace flag 8649 using the QUERYTRACEON query hint can be used to force the query to use the parallel plan execution if you find that the parallel plan execution is better for your query than the serial plan execution as in the following example:
1 2 3 4 5 6 7 8 9 10 |
USE SQLShackDemo GO SELECT ROW_NUMBER() over(partition by EmpID order by EmpID Desc) as RowID FROM Employees EMP JOIN Employee_Department EMPD ON EMP.EmpDEPID =EMPD.DepID WHERE EmpAddress not like '%AMM%' and EmpBirthDay >'1990-05-01' and EmpID>250 OPTION (QUERYTRACEON 8649) |
Another way that can be used to control the Max Degree of Parallelism is configuring a workload group while using the SQL Server Resource Governor Enterprise Edition feature by using the WITH MAX_DOP=value hint in the CREATE WORKLOAD GROUP group_name T-SQL statement as in the below statement:
1 2 3 4 5 |
CREATE WORKLOAD GROUP MaxDOPTest WITH (MAX_DOP = 4) USING [DEFAULT]; |
Conclusion
Parallelism is a fantastic SQL Server option that is used to take benefits of having more than one processor on the hosting server and split the current task into smaller tasks and distribute it among the available processors in order to complete it faster. Max Degree of Parallelism can be used to set the maximum number of processors that can be used to run the query in the parallel plan execution. There are many ways that can be used to set that value at the server, database, and query level.
- Azure Data Factory Interview Questions and Answers - February 11, 2021
- How to monitor Azure Data Factory - January 15, 2021
- Using Source Control in Azure Data Factory - January 12, 2021