Why should you enable alerts
While the article “How to create and configure SQL Server Agent Alerts” created by Minette Steynberg a SQLShack author discusses the features around Alerting through SQL Agent and some conditions for testing. It does not go through some of the common alerts that you should have enabled on your SQL Server.
To give a brief overview, Agent Alerts can provide monitoring of important events or conditions using the alerts feature three different ways:
- SQL Server event alerts – alerts when a specific SQL Server even occurs
- SQL server performance counters – alerts when a performance counter reaches the specified threshold
- WMI events – alerts when WMI reaches a specific threshold
This is important to make sure you are properly keeping an eye on your SQL Server. In my opinion, monitoring WMI events or performance counters with SQL Server is an expensive substitute for a monitoring solution. Thus, we will be focusing on the SQL Server event alerts.
Important Severity Alerts
When implementing this, it is a common practice for DBAs to enable alerts for Severity 17 or higher on their SQL Servers. This is because these are not correctable by end users. Again, I believe everyone should have a monitoring solution in place even if it is just monitoring resources on the machine, for this reason, I only enable them for 18 and above. If you don’t have a monitoring solution though, please enable alerts for 17 also. Error 17 indicates that a statement caused SQL Server to run out of resources.
Full documentation around the severities can be found here.
Error Severity | What it indicates |
18 | There is a problem with the database engine software. |
19 | Nonconfigurable Database engine limits were exceeded, and the batch was terminated. |
20 | A statement has encountered a problem with the current task, unlikely to cause damage to the database itself. |
21 | A problem was encountered that affects all tasks in the database, unlikely to cause damage to the database itself. |
22 | The table or index specified in the message has been damaged by a software or hardware problem. |
23 | The integrity of the entire database is in question because of a hardware or software problem. |
24 | Media failure. Most likely means a restore of the database and a call to your hardware vendor. |
25 | Unexpected errors, this is the catch all for Microsoft SQL Server. |
As you can see from the descriptions in the chart, the errors typically take your machine offline and can even be results of some serious corruption and loss of data.
Error messages
Next, it is important to also mention that you should set up alerts for the following error messages 823, 824 and 825. These are signs that your underlying storage system having issues and should be investigated by your system administrator and hardware vendor. Additionally, if you receive these messages as a DBA you should check the suspect pages table in SQL server and run a CHECKDB. This will confirm your state of your database. To query your suspect pages table, use the query below, more details about the event types can be found here.
1 2 3 4 5 6 7 8 9 |
SELECT db_name(database_id) as database_name, file_id,page_id, error_count, last_update_date, CASE WHEN event_type = '1' THEN 'Error 823 or 824' WHEN event_type = '2' THEN 'Bad checksum on page' WHEN event_type = '3' THEN 'Torn page' ELSE 'restored after was damaged or repaired/deallocated by DBCC' END as Event_information FROM msdb..suspect_pages WHERE (event_type = 1 OR event_type = 2 OR event_type = 3); |
This is what you should see, a nice empty table.
However, to monitor this daily or more frequently, you can use a job to check this table and confirm that the count is zero, if not send an email using Database mail.
SQL Server logs all high severity issues and error messages 823,824 and 825 in the error log inside SQL Server. So, for more details, if the error does not destroy your DB, you will be able to get more information about what happened in there. If your database is lost, Check the event viewer application logs for the SQL Server source. This should contain the same errors with potentially the same or roughly the same information.
Building on this topic a bit more, since we mentioned potential corruption. It is something that can happen to anyone, this will not prevent corruption to your database. In which, you will ask well what can we do to prevent it? Sadly, there is not a good answer, you really can’t prevent it. This is due to the fact you may just be unlucky enough that the first error can be the worst one and you get causes corruption in your database. For example, if the damage is caused to your boot page (1:9), you have no choice but to restore from a backup. On top of that, corruption can come from memory, a bad checksum on a page or your disk.
However, in monitoring these errors it may give you a head start to plan a migration to a new disk subsystem or work with the vendor prior to encountering a problem. This sounds like a ton of work, but it is less painful when compared to recovering corruption or restoring from backup when your system is down.
To summarize at a high level, alerts should be created for:
-
Events with Severity >= 18 if you have a monitoring solution, but if you don’t, enable 17 and above alerts
- These are high-severity errors that should be investigated by the system administrator/DBA
-
Error 823,824,825 read-retry errors
- These errors spell doom for your disk subsystem
Finally, as a word of warning, always validate your backups, there is a saying that I have taken a liking to over the years, it goes, that your last backup is only as good as the one you verified you can restore too. There are many spins on this saying, but it is true, if you don’t know it will work, you are flying blind. Secondly, when receiving one of these errors, take it seriously especially when speaking to hardware vendors and your system admins. Make sure they understand the gravity of these errors and stress the importance of acting on these errors. After all, it could very well keep you from updating your resume and leaving town.
To quickly enable all these alerts and severities, you can run the following code block. This gives you an easy way to customize alert names along with the Database mail operator name without manually configuring all of these alerts by yourself. However, to note, validate your settings are populated or if you want to add anything in the fields additionally, make sure you want these alerts to go to email instead of that old pager in the office somewhere (wink).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 |
USE [msdb] GO DECLARE @Sev17 Varchar(200), @Sev18 Varchar(200), @Sev19 Varchar(200), @Sev20 Varchar(200), @Sev21 Varchar(200), @Sev22 Varchar(200), @Sev23 Varchar(200), @Sev24 Varchar(200), @Sev25 Varchar(200), @Error823 Varchar(200), @Error824 Varchar(200), @Error825 Varchar(200), @DBMailOperator Varchar(100) SET @DBMailOperator = 'DBA Team' SET @Sev17 = 'Severity 17' SET @Sev18 = 'Severity 18' SET @Sev19 = 'Severity 19' SET @Sev20 = 'Severity 20' SET @Sev21 = 'Severity 21' SET @Sev22 = 'Severity 22' SET @Sev23 = 'Severity 23' SET @Sev24 = 'Severity 24' SET @Sev25 = 'Severity 25' SET @Error823 = 'Error 823' SET @Error824 = 'Error 824' SET @Error825 = 'Error 825' EXEC msdb.dbo.sp_add_alert @Name=@sev17, @message_id=0, @severity=17, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000'; EXEC msdb.dbo.sp_add_notification @alert_name=@sev17, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @Name=@Sev18, @message_id=0, @severity=18, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000'; EXEC msdb.dbo.sp_add_notification @alert_name=@sev18, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @Name=@sev19, @message_id=0, @severity=19, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000'; EXEC msdb.dbo.sp_add_notification @alert_name=@sev19, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @Name=@Sev20, @message_id=0, @severity=20, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000'; EXEC msdb.dbo.sp_add_notification @alert_name=@Sev20, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @Name=@Sev21, @message_id=0, @severity=21, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000'; EXEC msdb.dbo.sp_add_notification @alert_name=@Sev21, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @Name=@Sev22, @message_id=0, @severity=22, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000'; EXEC msdb.dbo.sp_add_notification @alert_name=@Sev22, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @Name=@Sev23, @message_id=0, @severity=23, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000'; EXEC msdb.dbo.sp_add_notification @alert_name=@Sev23, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @Name=@Sev24, @message_id=0, @severity=24, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000'; EXEC msdb.dbo.sp_add_notification @alert_name=@Sev24, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @Name=@Sev25, @message_id=0, @severity=25, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000'; EXEC msdb.dbo.sp_add_notification @alert_name=@Sev25, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @name=@Error823, @message_id=823, @severity=0, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000' EXEC msdb.dbo.sp_add_notification @alert_name=@Error823, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @name=@Error824, @message_id=824, @severity=0, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000' EXEC msdb.dbo.sp_add_notification @alert_name=@Error824, @operator_name=@DBMailOperator, @notification_method = 7; EXEC msdb.dbo.sp_add_alert @name=@Error825, @message_id=825, @severity=0, @enabled=1, @delay_between_responses=60, @include_event_description_in=1, @job_id=N'00000000-0000-0000-0000-000000000000' EXEC msdb.dbo.sp_add_notification @alert_name=@Error825, @operator_name=@DBMailOperator, @notification_method = 7; |
FAQs
What alerts should I enable?
High severity alerts, along with error messages 823, 824 and 825.
How can I quickly enable Database mail alerts in SQL Server?
Run the script in the article above.
Where are Database alerts logged?
High severity alerts are logged to the event viewer in the application log along with the SQL Server error log. Read more in the article above.
- Preparing for SQL Server Database Corruption; initial reaction and analysis - February 6, 2019
- Considerations and concerns when setting up Database mail for SQL Server - August 31, 2018
- SQL Server Setup – Instant File Initialization (IFI) - August 31, 2018