This article discusses the Forwarded Records and its performance issues for heap tables in SQL Server.
Heap introduction and performance issues
A page of 8KB is the smallest unit of storage in SQL Server. In each page of SQL Server, we have a 96-bytes header to store the system information. SQL Server stores the data in two ways:
-
Clustered index
It stores data in a B+ tree structure according to the defined clustered index key. SQL Server stores the new or an update to an existing row in the correct logical position
-
Heap
A table without any clustered index is a heap. A heap table stores data without any logical order. It does not link pages together because we do not have any defined key on heap tables. We can create a non-clustered index on the heap table, but that contains a physical address for the underlying data records. It contains file number, the page number and slot number inside that page for the record
-
Page Free Space ( PFS) page
SQL Server scans the Page Free Space (PFS) page before inserting a new record into the heap table. If the data page contains enough free space, it stores the new row into the existing page. Alternatively, if it does not have sufficient free space, SQL Server allocates an extent (eight new data pages – 64 KB). PFS monitors each data page using two bits as specified below:
0x00
Empty data page
0x01
50 % full data page
0x02
51 to 80% full data page
0x03
81 to 95% full data page
0x04
96 to 100 % full data page
Forwarded Record issue simulation
Let’s say, we update existing data in a heap table. SQL Server cannot accommodate the new data into the existing page due to its larger data size. In this case, this data is stored in a different storage location, and SQL Server inserts a Forwarded Record in the original data location. SQL Server also maintains a pointer at the new data location pointing to the forwarded pointer so that it can keep track of the forwarding pointer chain in case of any data movement.
Let’s create a heap table and reproduce the forwarded record scenario:
Create a database and heap table Employees
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
CREATE DATABASE SQLShack; GO USE SQLShack; GO CREATE TABLE Employees ([EmpID] INT IDENTITY(1, 1), [Name] NVARCHAR(100), [BirthDate] DATETIME, [Salary] INT ); Insert sample records INSERT INTO Employees VALUES ('Rajendra', '1986-03-16', 50000 ); GO 2000 INSERT INTO Employees VALUES ('Sonu', '1980-11-29', 50000 ); GO 1000 |
Heap tables have index id zero, and you can identify heap tables using the sys.indexes system view. The following command shows the required output from the system view:
1 2 3 4 5 |
SELECT OBJECT_NAME(object_id) AS tablename, index_id, type_desc FROM sys.indexes WHERE object_id = OBJECT_ID('Employees'); |
We use dynamic management function (DMF) sys.dm_db_index_physical_stats for checking index fragmentation percentage, table index type, page counts and forwarded record counts. These details are available in the DETAILED mode of this DMV:
1 2 3 4 5 6 7 8 |
SELECT OBJECT_NAME(DIPS.object_id) as DBTableName, DIPS.index_type_desc, DIPS.avg_fragmentation_in_percent, DIPS.forwarded_record_count, DIPS.page_count FROM sys.dm_db_index_physical_stats (DB_ID(), DEFAULT, DEFAULT, DEFAULT, 'DETAILED') AS DIPS WHERE OBJECT_NAME(DIPS.object_id) = 'Employees' AND forwarded_record_count is NOT NULL |
In the above screenshot, we can see 17 pages for the heap table Employees and zero forwarded record counts.
Insert a few more records using the following query. It should increase the pages and index fragmentation:
1 2 3 4 5 6 7 |
INSERT INTO Employees VALUES ('Kusum', '1985-09-25', 60000 ); GO 2000 |
Re-execute the DMF query specified above and validate the page counts and forward record counts:
- Page Counts increases from 17 to 27
- Fragmentation increases from 33 to 50%
- Forwarded record counts still at Zero
Now, update the records in the Employee table for variable-length column [Name]:
1 2 3 4 |
UPDATE Employees SET [Name] = 'Rajendra Kumar Gupta' WHERE [Name] = 'Rajendra'; |
Now, again use the DMF and view the forward record counts:
- Page Counts increases from 27 to 35
- Forwarded record counts increase from zero to 747
Let’s perform a few more updates using the following query:
1 2 3 4 |
UPDATE Employees SET [Name] = 'Kusum Kashish Agarwal' WHERE [Name] = 'Kusum' |
In the following screenshot, note the things:
- Page Counts increases from 35 to 46
- Forwarded record counts increase from 747 to 1752
SQL Server could not accommodate the new updates in the existing pages and created the forward records that increase page counts and forwarded_record_counts.
Question: Do Forwarded Records cause any performance issues?
Yes, we can see a performance impact on the heap table due to these Forwarded Records. Execute the select statement that performs the search on the [Name] column. We enable the statistics IO to capture the required statistics:
1 2 3 4 |
SET STATISTICS IO ON; SELECT * FROM dbo.Employees WHERE name LIKE 'Rajendra%'; |
In the execution plan, we can see a table scan operator for the heap:
In the following image, we can see the SQL Server uses IAM pages to find the pages and extents that requires a scan. It analyzes the extent belonging to the heap table and processes them on their allocation order:
If the page allocation changes, it also changes the order in which pages need to be scanned. Here, it scans a page before page 2 because of its allocation:
In the message tab of previous query execution, it shows information about logical, physical scans. We can see 1798 logical read operations for retrieving the requested data:
In the case of a large heap table, we can see the considerable value of these logical reads that can be causing performance issues for the data retrieval, DML’s.
Forwarded Records issue fixes
Use fixed-length data type
Sometimes, we use heap tables for the staging tables and do not require clustered index on a heap table. The best way to fix these forwarded record issues and avoid so many logical reads is by using fixed-length data types. We should not use variable-length data types unless required.
Use a Clustered index
We should add a clustered index to a table because it sorts and stores data as per the clustered index key. It works for existing as well as new and updated data. Ideally, we should define a primary key on the table as it creates a clustered index key by default.
Monitor and rebuild heap table
If due to any specific requirement, we cannot use fixed-length data type or clustered index on a heap table, the best way is to monitor heap tables for Forwarded Records using the scripts provided earlier. We can use Alter Table..REBUILD command to rebuild a heap table. It is available from SQL Server 2008. It also updates the non-clustered index on the heap table.
Logical reads before table rebuild
Rebuild heap table
Execute the following command to rebuild Employee heap table:
1 |
ALTER TABLE Employees REBUILD; |
Verify the logical reads after rebuild
We can notice the following change:
- Logical Reads before heap rebuild: 1798
- Logical Reads after heap rebuild: 40
Alter table command completely rebuilds the heap and removes all Forwarded Records. We do not any Forwarded Records after rebuild. Let’s verify it:
- Page Counts reset to 40 from 46
- Forwarded record counts become zero from 1752 to zero
- Average fragmentation for heap also reduces to zero from the previous 33 percent
Conclusion
Database design is an essential criterion for the developers and database administrators for avoiding performance issues due to design. In this article, we explored the Forwarded Records issues with the heap table and logical reads due to these large forwarded record counts. It also wastes the database pages due to Forwarded Records and put IO performance issues due to the large tables. We should try to avoid heap tables and if required, take the necessary steps outlined above.
- Understanding PostgreSQL SUBSTRING function - September 21, 2024
- How to install PostgreSQL on Ubuntu - July 13, 2023
- How to use the CROSSTAB function in PostgreSQL - February 17, 2023