This article explains the internals of the Ghost cleanup task for SQL Server Databases.
Introduction
Each relational database consists of the background process and user processes.Similarly, SQL Server has many internal processes such as Checkpoint, Lazy Writer, Lock Monitor, System_Health_Monitor, Trace Queue Task and Ghost CleanUp. Broadly, we can divide these system processes into the following categories.
-
Data Writer Processes
- Lazy writer
- CHECKPOINT
- EAGER WRITER
-
Logging Transaction Processes
- LOG WRITER
- BACKUP LOG
-
System processes
- RESOURCE MONITOR
- LOCK MONITOR
- GHOST CLEANUP
- TRACE QUEUE TASK
-
Job Scheduling processes
- SQL Server Agent
I have seen DBAs have an idea about Checkpoint, Lazy Writer process, but they ignore the importance of the Ghost Cleanup background process. To understand some of these background processes, refer to the article – SQL Server CHECKPOINT, Lazy Writer, Eager Writer and Dirty Pages in SQL Server
Have you seen the GHOST CLEANUP task while checking the processes using sp_who2 or dynamic management view sys.dm_exec_requests? You can identify a background process using the status column output as Background.
This article walks through with an insight on the ghost cleanup task in the SQL Server Database.
Pre-requisites
It would help if you had a basic understanding of SQL Server indexes – Clustered and nonclustered index to understand this article’s ghost cleanup task.
- Note: You can always refer to Indexes for understanding index requirements, architecture, usage
What is a Ghost Cleanup Task for SQL Server Database?
Ghost cleanup is a background process that periodically runs and cleans up ghost records. The next question is – What is a ghost record?
When users delete records from the clustered index data page or nonclustered index leaf page, SQL Server does not physically remove them. Instead, it marks those records as Ghost Records. Therefore, the deleted row stays on the page; however, the row header indicates that the row is a ghost row.
It is a performance optimization technique so that the delete operation performs logical deletion and finishes quickly. Similarly, during a rollback, SQL Server has to unmark the record as a ghost record. Thus, it eliminates the requirement of reinserting those deleted records for a rollback.
Once the delete transaction commits and the ghost cleanup task occurs, it checks the ghost records on a page and removes them physically. SQL Server also updates the PFS (Page Free Space) page header to mark it as a ghost record page. It also marks the database as having ghosted entries.
The ghost cleanup task performs the following work:
- The ghost cleanup task occurs every 5 seconds (SQL Server 2012+)
- It starts checking databases marked as having ghost entries and scans the PFS pages of the ghost database
- It removes the ghost record rows physically from the database
- Once it removes all ghost records from a database, it marks it with no ghost entries. Therefore, SQL Server skips the database during the next run of the ghost cleanup task
- A single run of the GHOST CLEANUP task cleans up a maximum of 10 pages in each execution. It is to ensure that it does not impact the system’s performance. If there are multiple pages ( more than 10), it processes them during the subsequent execution of the ghost cleanup task
- If the cleanup process could not take a shared lock on the database having ghost records, it skips it and checks again during the next run
Important Notes:
Usually, we do not see the GHOST CLEANUP task if you query sp_who2 or DMV because it is a quick process. However, for an extensive database with multiple frequent deletes, you can observe this in the session’s output.
In this article, to capture the ghost cleanup task, we will create a table [GhostSession] and insert records into table using a WHILE loop until the GHOST CLEANUP background processis captured.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
CREATE TABLE [dbo].[GhostSession] ( [session_id] [smallint] NOT NULL, [start_time] [datetime] NOT NULL, [status] [nvarchar](30) NOT NULL, [command] [nvarchar](32) NOT NULL, [database_id] [smallint] NOT NULL ) ON [PRIMARY] GO SET NOCOUNT ON; GO DECLARE @i INT SELECT @i = 0; WHILE (@i < 1) BEGIN INSERT INTO [GhostSession]( [session_id],[start_time],[status], [command],[database_id]) SELECT [session_id],[start_time],[status], [command],[database_id] FROM sys.dm_exec_requests WHERE command LIKE '%ghost%' SELECT @i = COUNT (*) FROM [GhostSession] END; GO Select * from [GhostSession] |
As shown below, SQL Server triggered the ghost cleanup task for database id 5.
To find out ghost records in a database, we can use the DMV sys.dm_db_index_physical_stats as per the below script. This script groups records on the database id and orders them in descending order of total ghost records.
- Note: The DMV returns size,fragmentation information and ghost records information of the specified table or DB. You can refer Microsoft docs for understanding the use cases of the DMV sys.dm_db_index_physical_stats
1 2 3 4 |
SELECT sum(ghost_record_count) total_ghost_records, db_name(database_id) as DB FROM sys.dm_db_index_physical_stats (NULL, NULL, NULL, NULL, 'SAMPLED') group by database_id order by total_ghost_records desc |
As shown below, the [AdventureWorks2017] database has two ghost records while running the DMV. We do not have any ghost records for remaining databases.
Practical demonstration of ghost records for a SQL Server database
Until you have got understanding of the ghost cleanup task for SQL Server. This section will look at internal SQL Server pages to identify the ghost records and cleanup process. For this purpose, let’s create a SQL Server database and insert sample records in it.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
Create Database TestDatabase Go Use TestDatabase Go CREATE TABLE [dbo].[ViewGhostRecords] ( [ID] Int PRIMARY KEY CLUSTERED, [Name] varchar(100) NULL, [Amount] int NULL ) ON [PRIMARY] CREATE NONCLUSTERED INDEX [IX_Amount] ON [dbo].[ViewGhostRecords] ( [Amount] ) declare @i int Set @i=0 While(@i<1000) BEGIN insert into [dbo].[ViewGhostRecords] values (@i, 'Ghost record demo',100) set @i= @i+1 END |
Before we proceed further, we enable the following two trace flags:
- Trace flag 3604 to view DBCC results in SSMS
- Trace flag 661 to disable the GHOST CLEANUP task
- Note: Do not enable trace flags in the production environment unless you have a specific requirement
1 2 3 4 |
DBCC traceon (3604,-1) go DBCC traceon (661,-1) go |
- Note: Do not disable GHOST CLEANUP in a production instance. If required, have a controlled environment testing before doing it for production
- Note: The execution time of these stored procedures depends on the size of the database, data file, available free space, disk capacity. You should run them on off business hours to avoid any performance impact
- Understanding PostgreSQL SUBSTRING function - September 21, 2024
- How to install PostgreSQL on Ubuntu - July 13, 2023
- How to use the CROSSTAB function in PostgreSQL - February 17, 2023
Further, we remove records from the demo table and view page data using the DBCC PAGE. You can also use DMV sys.dm_db_page_info in SQL Server 2019.
1 2 3 4 5 |
BEGIN TRAN ABCD Delete from [dbo].[ViewGhostRecords] go DBCC PAGE(7,1,1,3) WITH TABLERESULTS go |
The sample output shows entries on the PFS page that has a message:
You can query index DMV to check the number of ghost records. As shown below, it has 2000 ghost records as of now.
We can also use the fn_dblog() function to check the content of the transaction log.
1 2 3 4 5 6 7 8 |
DECLARE @TranID CHAR (20) SELECT @TranID = [Transaction ID] FROM fn_dblog (null, null) WHERE [Transaction Name]='ABCD' SELECT * FROM fn_dblog (null, null) WHERE [Transaction ID] = @TranID; GO |
You see the operation as LOP_DELETE_ROWS and Context as LCX_MARK_AS_GHOST for the deleted rows in the sample table in the output.
Disable the ghost cleanup for the SQL Server database
Disabling the ghost cleanup process can increase the buffer pool size, raise IO requests. It might increase database size because SQL Server will not reuse the space that is cleared by deleting the rows. SQL Server also has to add new pages for accommodating new inserts that increase page splits and ultimately performance issues
As we saw earlier, you can use trace flag 661 for disabling the ghost cleanup. However, it is not recommended to do so until you have a specific requirement and it is tested well enough for a production release.
However, if you disable the ghost cleanup task, you can manually delete the ghost records using the following stored procedures.
Use of stored procedure sp_clean_db_free_space in SQL Server Database
This procedure cleans the ghost records on the pages of a database. It requires two arguments –database name (mandatory) and cleaning delay time in seconds (optional). The cleanup delay is an interval that you can specify to delay between cleaning of the pages. The default value is 0.
The syntax of the sp_clean_db_free_space procedure:
1 2 3 |
sp_clean_db_free_space [ @dbname = ] 'database_name' [ , [ @cleaning_delay = ] 'delay_in_seconds' ] [;] |
Stored procedure sp_clean_db_file_free_space in SQL Server Database
This procedure can clean up ghost records in a specific data file for a database. As shown in the syntax, it requires an additional data file id (mandatory) argument.
The syntax of the sp_clean_db_file_free_space procedure:
1 2 3 4 |
sp_clean_db_file_free_space [ @dbname = ] 'database_name' , [ @fileid = ] 'file_number' [ , [ @cleaning_delay = ] 'delay_in_seconds' ] [;] |
Conclusion
This article explored the ghost cleanup task in SQL Server Database for removing the ghost records. You should get an understanding of this helpful background process. If you disable it for a specific reason, use the stored procedure to clean them manually. The article’s purpose is to give an understanding of the ghost cleanup task. Therefore, I again reinstate that never disable ghost cleanup task in SQL Server unless you have a particular reason to do so.