This article gives a walk-through of SQL Server Statistics and different methods to perform SQL Server Update Statistics.
Introduction
SQL Server statistics are essential for the query optimizer to prepare an optimized and cost-effective execution plan. These statistics provide distribution of column values to the query optimizer, and it helps SQL Server to estimate the number of rows (also known as cardinality). The query optimizer should be updated regularly. Improper statistics might mislead query optimizer to choose costly operators such as index scan over index seek and it might cause high CPU, memory and IO issues in SQL Server. We might also face blocking, deadlocks that eventually causes trouble to the underlying queries, resources.
Options to view SQL Server statistics
We can view SQL Server statistics for an existing object using both SSMS and the T-SQL method.
SSMS to view SQL Server Statistics
Connect to a SQL Server instance in SSMS and expand the particular database. Expand the object ( for example, HumanResources.Employee), and we can view all available statistics under the STATISTICS tab.
We can get details about any particular statistics as well. Right-click on the statistics and go to properties.
It opens the statistics properties and shows the statistics columns and last update date for the particular statistics.
Click on the Details, and it shows the distribution of values and the frequency of each distinct value occurrence (histogram) for the specified object.
T-SQL to view SQL Server Statistics
We can use DMV sys.dm_db_stats_properties to view the properties of statistics for a specified object in the current database.
Execute the following query to check the statistics for HumanResources.Employee table.
1 2 3 4 5 6 7 8 9 10 11 12 |
SELECT sp.stats_id, name, filter_definition, last_updated, rows, rows_sampled, steps, unfiltered_rows, modification_counter FROM sys.stats AS stat CROSS APPLY sys.dm_db_stats_properties(stat.object_id, stat.stats_id) AS sp WHERE stat.object_id = OBJECT_ID('HumanResources.Employee'); |
- Stats_ID: It is the unique ID of the statistics object
- Name: It is the statistics name
- Last_updated: It is the date and time of the last statistics update
- Rows: It shows the total number of rows at the time of the last statistics update
- Rows_sampled: It gives the total number of sample rows for the statistics
- Unfiltered_rows: In the screenshot, you can see both rows_sampled and unfiltered_rows value the same because we did not use any filter in the statistics
- Modification_counter: It is a vital column to look. We get the total number of modifications since the last statistics update
The different methods to perform SQL Server update Statistics
SQL Server provides different methods at the database level to update SQL Server Statistics.
Right-click on the database and go to properties. In the database properties, we can view statistics options under the Automatic tab.
Auto Create Statistics
SQL Server automatically creates statistics on the individual columns for the query predicate to improve the cardinality estimate and prepare a cost-effective execution plan.
-
Auto Create Statistics name starts with _WA
Use the following query to identify statistics auto-created by SQL Server.
12345678910111213SELECT sp.stats_id,name,filter_definition,last_updated,rows,rows_sampled,steps,unfiltered_rows,modification_counterFROM sys.stats AS statCROSS APPLY sys.dm_db_stats_properties(stat.object_id, stat.stats_id) AS spWHERE stat.object_id = OBJECT_ID('HumanResources.Employee')and name like '_WA%'; -
Auto-created SQL Server statistics are single column statistics
- SQL Server creates auto statistics for the columns that do not have a histogram for the existing statistics object
Auto Create Incremental Statistics
Starting from SQL Server 2014, we have a new option Auto-Create Incremental Statistics. SQL Server requires scanning the entire table for SQL Server update statistics, and it causes issues for the large tables. It is also valid for a table with partitioning as well. We can use this feature to update only the partition that we require to update. By default, Auto Create Incremental is disabled for individual databases. You can go through Introducing SQL Server Incremental Statistics for Partitioned Tables to get more knowledge on incremental statistics.
Auto Update Statistics
We regularly perform DML operations such as insert, update, delete and such operations change the data distribution or histogram value. Due to these operations, statistics might be out of date, and it might cause issues with the query optimizer efficiency. By default, the SQL Server database has an option Auto Update Statistics true.
With this Auto Update Statistics option, query optimizer updates the SQL Server update statistics when the statistics are out of date. SQL Server uses the following method to update statistics automatically.
SQL Server 2014 or before | |
Number of rows at the time of statistics creation | Auto Update Statistics |
<=500 | Update statistics for every 500 modifications |
>500 | Update statistics for every 500 + 20 percent modifications |
For the large tables, we require to update 20% of a row to auto-update the statistics. For example, a table with 1 million rows requires 20,000 rows updates. It might not be suitable for the query optimizer to generate an efficient execution plan. SQL Server 2016 onwards, it uses dynamic statistics update threshold, and it adjusts automatically according to the number of rows in the table.
Threshold = √((1000)*Current table cardinality)
For example, a table with one million rows we can use the formula to calculate the number of updates after which SQL Server will automatically update statistics.
Threshold = √(1000*1000000) = 31622
SQL Server updates the statistics after the approx. 31622 modifications in the object.
- Note: the database compatibility level should be 130 or above to use this dynamic threshold statistics calculations.
Auto Update Statistics Asynchronously
SQL Server uses synchronous mode to update the statistics. If query optimizer finds out of date statistics, it updates the SQL Server Statistics first and then prepares the execution plan according to the recently updated statistics.
If we enable the Auto Update Statistics Asynchronously, SQL Server does not wait to update the statistics. Instead, it executes the query with existing statistics and initiates update statistics requests’ in parallel. The next executed query takes the benefit of the updated statistics. Since SQL Server does not wait for the updated statistics, we also call it Asynchronous mode statistics.
Manually Update Statistics
In the previous section, we learned that SQL Server automatically updates the out-of-date statistics. We can also manually update the statistics for improving the query execution plan and performance on a requirement basis. We can use the UPDATE STATISTICS or Sp_Update stored procedure to update SQL Server statistics.
Let’s use the UPDATE STATISTICS command and its various options to update SQL Server statistics.
Example 1: SQL Server UPDATE STATISTICS for all statistics in an object
Execute the following query to update SQL Server Statistics on HumanResources.Employee table.
1 |
Update STATISTICS HumanResources.Employee |
In the following screenshot, we can verify that all the stats get an update at the same time.
Example 2: SQL Server UPDATE STATISTICS for specific statistics
Let’s say we want to update SQL Server statistics for statistics IX_Employee_OrganizationNode. Execute the following code.
1 |
Update STATISTICS HumanResources.Employee IX_Employee_OrganizationNode |
It updates only specific statistics. In the following screenshot, we can verify this.
Example 3: SQL Server UPDATE STATISTICS with FULL Scan
We use FULL SCAN in the UPDATE STATISTICS to scan all rows of a table. In the previous examples, we did not specify the FULL SCAN parameter. Therefore, SQL Server automatically decides whether it requires FULL SCAN or not.
The following query does a full scan and updates the statistics for specific statistics in the specified object.
1 |
Update STATISTICS HumanResources.Employee IX_Employee_OrganizationNode WITH FULLSCAN |
We can also use the WITH SAMPLE 100 PERCENT clause instead of WITH FULLSCAN and both returns the same result.
1 |
Update STATISTICS HumanResources.Employee IX_Employee_OrganizationNode WITH SAMPLE 100 PERCENT |
Example 4: UPDATE STATISTICS with SAMPLE
We can use WITH SAMPLE CLAUSE to specify the percentage or number of rows for the query optimizer to update statistics.
The following query specifies a 10 percent sample to update the statistics.
1 |
Update STATISTICS HumanResources.Employee IX_Employee_OrganizationNode WITH SAMPLE 10 PERCENT |
The following query specifies 1000 rows sample to update the statistics.
1 |
Update STATISTICS HumanResources.Employee IX_Employee_OrganizationNode WITH SAMPLE 1000 ROWS |
-
Note:
- We should not specify 0 PERCENT or 0 Rows to update the statistics because it just updates the statistics object, but it does not contain statistics data
- We cannot use FULL SCAN and SAMPLE together
- We should use SAMPLE under specific requirements only. We might take less sample size, and it might not be suitable for the query optimizer to choose the appropriate plan
- We should not disable the auto-update statistics even we are regularly updating the SQL Server Statistics. Auto Update Statistics allows SQL Server to automatically update stats according to the predefined threshold
- Updating statistics with FULL SCAN might take longer for an extensive data set object. We should plan it and do it off business hours
We usually perform database maintenance such as index rebuild or index reorganize. SQL Server automatically updates the statistics after the index rebuild. It is equivalent to update statistics with FULL SCAN however; it does not update the column statistics. We should update column statistics after index rebuild as well. We can use the following queries to do the task for all statistics on a specified object.
1 |
Update STATISTICS HumanResources.Employee WITH FULLSCAN, COLUMNS |
SQL Server does not update statistics with the index reorganize activity. We should manually update the statistics, if required or need to rely on the automatically updated statistics.
Updating All Statistics with sp_updatestats
We can update use sp_updatestats to update all statistics in the database. It does through each object statistics and performs the required update. For the large databases, it might take unnecessary longer time and system resources as well because it performs a check on each statistic on the object.
If the update is not required, we get the following message.
0 index(es)/statistic(s) have been updated, 1 did not require update.
If it updates the statistics, we get the following message.
1 index(es)/statistic(s) have been updated.
1 |
exec sp_updatestats |
Update STATISTICS using SQL Server Maintenance Plan
We can configure a SQL Server maintenance plan to update the statistics regularly. Connect to SQL Server instance in SSMS. Right-click on the Maintenance Plans and go to Maintenance Plan Wizard.
Select the Update Statistics maintenance task from the list of tasks.
Click Next, and you can define the Update Statistics task.
In this page, we can select the database (specific database or all databases), objects (specific or all objects). We can also specify to update all, column or index statistics only.
We can further choose the scan type as a Full Scan or sample by. In the Sample by, we need to specify the sample percentage or sample rows as well.
Conclusion
In this article, we explored the concept of SQL Server Statistics and various options to update these statistics with both automated and manual methods. We should regularly monitor the statistics and update them as per the requirement. You can also go through SQL Server Statistics in Always On Availability Groups to get SQL Server statistics behavior in AG group databases.
- Understanding PostgreSQL SUBSTRING function - September 21, 2024
- How to install PostgreSQL on Ubuntu - July 13, 2023
- How to use the CROSSTAB function in PostgreSQL - February 17, 2023