In this article, we will focus on one major SQL Server performance issue that we may experience during heavy data insertion in a table.
Introduction
SQL Server allocates some memory from the operating system, and this is because it is designed to do all read and modification operations in memory instead of on disk. The main purpose of this functionality is to reduce physical I/O operations and also reduce the response time of the operations. There is no doubt that this architecture improves SQL Server’s overall performance positively. This reserved memory area is called the buffer pool or buffer cache. A latch is an internal SQL Server mechanism and it is responsible to coordinate and also protect the integrity of the index and data pages in the buffer pool. However, in a heavy workload, we can experience some performance issues related to this internal mechanism.
What is the PAGELATCH?
PAGELATCH is a thread synchronization mechanism and it works as a conductor to synchronize the access to index or data pages that are located in the buffer pool. The goal of this synchronization mechanism is to ensure the consistency of data or index pages in the buffer pool. We can say that surely, the PAGELATCH wait type can be seen on every busy SQL Server and does not affect the SQL Server performance negatively.
The last page insert contention
Primary keys constraints uniquely identify each row in the table and automatically creates a clustered index on the underlining table. This duo is frequently used in table design by database developers. At the same time, if this column is decorated with an identity constraint thus we obtain a sequential incremental index key column. The clustered index creates a sorted data structure of the table for this reason a newly inserted row will be added at the end of the clustered index page until that page is filled. When solely one thread adds data to the above-mentioned table, we will never experience a last page insert contention because this problem will occur with concurrent usage of this table. In the high-volume insert operations, the last page of the index is not accessed by all threads concurrently. All threads start waiting for the last page to be accessible to them because the last page is locked by a thread. This bottleneck affects the SQL Server performance and the PAGELATCH_EX wait type begins to be observed too much.
Now we will demonstrate this issue with a very simple example. At first, we create a sample table through the following query.
1 2 3 4 |
CREATE TABLE InsertTestTable( Id INT PRIMARY KEY IDENTITY(1,1) NOT NULL , Col VARCHAR(50) NOT NULL) GO |
As we can see, the Id column is specified as a clustered index, and also an identity constraint enables the auto-increment feature for this column. In order to generate a heavy insert workload on this table, we will use the SQLQueryStress tool. With help of the SQLQueryStrees, we generate an intensive insert workload on the table
1 2 3 4 5 6 7 |
SET NOCOUNT ON; DECLARE @i int = 1 WHILE @i < 1000 BEGIN INSERT INTO dbo.InsertTestTable (Col) VALUES ('Test Value') SET @i += 1 END; |
After launching the SQLQueryStrees, we will set the database credentials and click the OK button.
As the second step, we will set the Number of Iterations as 100 and the Number of Threads as 100. So that, SQLQueryStress the query will run 10 times by the 50 threads concurrently.
After starting the SQLQueryStress tool, we will monitor the system to identify the resource bottlenecks.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
SELECT dm_ws.wait_duration_ms, dm_ws.wait_type, dm_r.status, dm_r.wait_resource, dm_t.TEXT, dm_qp.query_plan, dm_ws.session_ID, dm_es.cpu_time, dm_es.memory_usage, dm_es.logical_reads, dm_es.total_elapsed_time, dm_es.program_name, DB_NAME(dm_r.database_id) DatabaseName FROM sys.dm_os_waiting_tasks dm_ws INNER JOIN sys.dm_exec_requests dm_r ON dm_ws.session_id = dm_r.session_id INNER JOIN sys.dm_exec_sessions dm_es ON dm_es.session_id = dm_r.session_id CROSS APPLY sys.dm_exec_sql_text (dm_r.sql_handle) dm_t CROSS APPLY sys.dm_exec_query_plan (dm_r.plan_handle) dm_qp WHERE dm_es.is_user_process = 1 |
The above image illustrates that the session status is suspended for various rows and their wait types are PAGELATCH_EX. This situation means that they are waiting for a particular page that is not accessible at that time.
The wait resource gives some number combinations and their explanations :
11: Database Id
1: File Id
14339: Page Id
We can look inside this page through the DBCC PAGE command.
1 2 3 4 |
DBCC TRACEON(3604) GO DBCC PAGE(11,1,14339,0) GO |
Under the PAGE HEADER section of the report, we can find out all details about this page.
m_type =1 indicates that this page store data records in a heap or clustered index leaf-level
IndexId = 1 indicates this page belongs to the index which is index id is equal to 1
ObjectId = 581577110 specifies the object id of the database object.
We can find out this table and index name very quickly with help of the following query.
1 2 3 4 5 6 7 8 9 10 |
SELECT name AS Index_Name, type_desc As Index_Type, is_unique, OBJECT_NAME(object_id) As Table_Name FROM sys.indexes WHERE is_hypothetical = 0 AND index_id != 0 and object_id=581577110 |
As you can see, the problematic index belongs to our example table. Now let’s examine how we can overcome this problem. Meanwhile, our test was completed in 58 seconds.
Enable OPTIMIZE_FOR_SEQUENTIAL_KEY option
OPTIMIZE_FOR_SEQUENTIAL_KEY feature has been introduced with SQL Server 2019 and it limits the number of threads allowed to request the latch to one per scheduler so it will reduce time spent in the runnable queue once the latch is acquired. To enable this feature, we will execute the following query:
1 2 |
ALTER INDEX PK__InsertTe__3214EC074999CCCE ON dbo.InsertTestTable SET(OPTIMIZE_FOR_SEQUENTIAL_KEY = ON); |
After enabling the OPTIMIZE_FOR_SEQUENTIAL_KEY feature, we will run SQLQueryStress with the same query and with same settings.
The PAGELATCH_EX wait type gives its place to BTREE_INSERT_FLOW_CONTROL wait type.
Despite this wait type, we gained some SQL Server performance improvements. The chart below shows the comparison of when the OPTIMIZE_FOR_SEQUENTIAL_KEY property is enabled and when it is disabled.
Define Non-Clustered index on the Primary Key column
The main difference between the clustered index and non-clustered index in SQL Server is that the clustered index creates a sorted structure of the table. On the other hand, the heap tables do not contain any clustered index therefore the insert operations are not enforcing a strict order, so the insert operation is faster. To resolve the last page insert problem and reduce the PAGELATCH_EX wait type we can take advantage of this design approach so it improves SQL Server performance. At first, we will drop the clustered index and primary key on the InsertTestTable.
1 2 |
ALTER TABLE InsertTestTable DROP CONSTRAINT PK__InsertTe__3214EC074999CCCE |
As a second step, we will create a primary key with a non-clustered index.
1 2 3 |
ALTER TABLE InsertTestTable ADD CONSTRAINT PrimaryKey_Id PRIMARY KEY NONCLUSTERED (ID) |
After creating the non-clustered index, we will run the SQLQueryStress with the same parameters and for the same query.
The test was completed in 57 seconds and does not much affect the SQL Server performance of the insert operation.
Use In-Memory Optimized Tables
We can use the in-memory optimized tables instead of the disk-based tables. In the context of the SQL Server performance, memory-optimized tables can resolve the latches and lock issues. To create a memory-optimized table, we need to apply the following settings :
-
We add a memory-optimized file group to hold the memory-optimized data in the database
12ALTER DATABASE SQLShackDemoADD FILEGROUP SQLShackDemo_mod CONTAINS MEMORY_OPTIMIZED_DATA; -
We add a new file into the memory-optimized filegroup
1234ALTER DATABASE SQLShackDemoADD FILE (name='SQLShackDemo_mod1',FILENAME='C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\DATA\SQLShackDemo_mod1.ndf')TO FILEGROUP SQLShackDemo_mod
After creating the required filegroup and files on the database, we will create durable (SCHEMA_AND_DATA) memory-optimized tables. SQL Server offers two types of memory-optimized table SCHEMA_ONLY and SCHEMA_AND_DATA.
SCHEMA_ONLY
In this table type, the data is lost when the SQL Server is restarted but the schema is always persisted. The SCHEMA_ONLY tables are so fast because they don’t generate any physical I/O.
SCHEMA_AND_DATA
In this table type, the data and schema are not lost as the disk-based tables.
The following query will create a memory-optimized table and also this table includes a non-clustered index and primary key. The memory-optimized tables do not support clustered indexes.
1 2 3 4 |
CREATE TABLE InMemoryInsertTestTable (Id INTEGER NOT NULL IDENTITY(1, 1) PRIMARY KEY NONCLUSTERED, Col VARCHAR(50) NOT NULL ) WITH(MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); |
We will start the same test for this table.
As we can see, using the memory-optimized table dramatically improves the SQL Server performance for insert operation and we don’t monitor any PAGELATCH_EX wait type.
Conclusion
In this article, we have analyzed the last page insert issue and also have learned how it occurs. The last page insert issue affects SQL Server performance negatively on the intensive insert operation. To resolve this type of problem, we can apply the following methods :
- Enable OPTIMIZE_FOR_SEQUENTIAL_KEY option
- Consider using memory-optimized tables
- Define Non-Clustered index on the Primary Key column
- SQL Performance Tuning tips for newbies - April 15, 2024
- SQL Unit Testing reference guide for beginners - August 11, 2023
- SQL Cheat Sheet for Newbies - February 21, 2023