Introduction
Intended audience
This article is intended for application developers and database administrators who plan to develop, deploy, and/or assess solutions for Microsoft SQL Server on a Microsoft Windows platform.
Typographical Conventions
Convention | Meaning |
Stylized Consolas Font |
Used for blocks of code, commands and script examples. Text should be interpreted exactly as presented |
Consolas Font |
Used for inline code, commands or examples. Text should be interpreted exactly as presented |
<italic font in brackets> | Italic texts set in angle brackets denote a variable requiring substitution for a real value |
Italic font | Used to denote the title of a book, article, or other publication |
Note | Additional information or caveats |
Overview
The subject of this article is untrusted foreign keys. Those three words together have a meaning but each word has to be well understood before that. That’s the reason why this article will start with a definitions section where we will define what is a key, what is a foreign key, and how a foreign key can be “untrusted”. Then, we will cover the way to discover and repair untrusted foreign keys for a given database. Finally, we will demonstrate in detail how foreign keys influence the execution of a query when it does not exist, when it exists and is “trusted” and when it exists and is “untrusted”.
Definitions
Before going any further, let’s set the base and answer the following questions: « What is a key? », « What is a primary key? » and finally « What is a foreign key? », which is the topic of this article.
Then, we can talk about what the difference between a trusted and an untrusted foreign key.
What is a key for a relational table?
A key is basically an identifier, something that can uniquely identify a record in a relational table. Keys may consist of a single attribute or multiple attributes in combination. Based on the design of a table, there can be more than just one key which identify a record. We will refer to them as candidate keys.
Example:
Let’s have a look at the following table. It represents a list of students with 4 columns: a random numeric identifier and the firstname, lastname and sex of the student.
StudentID | FirstName | LastName | Sex |
5345664 | Adam | Kent | Male |
8795165 | Jefferson | Elias | Male |
There are two candidate keys here: the StudentID, and the combination of FirstName and LastName columns.
Let’s add a table with the courses that are given in a particular school. The table will contain a unique identifier for the course, a title and a description.
CourseId | Title | CourseDescription |
7897 | Networks | An introductive course on network topologies and standards |
8975 | Numerical Analysis | Basics on numerical analysis. |
There are, also, two candidate keys here: the CourseId column alone and the Title column alone.
What is a primary key for a relational table?
The primary key of a relational table is a key that is considered as the most appropriate one among all the acceptable keys for this table. As a key, it can either be a normal attribute that is guaranteed to be unique such as a unique random alpha-numeric identifier or it can be generated by the DBMS.
In our example, the most appropriate key to identify a given student record is the StudentID column.
What is a foreign key in a relational table?
A foreign key is a basically a reference to another table in a DBMS. It comprises all the key columns composing the primary key of that « foreign » or « parent » table. One who defines a foreign key creates at the same time a referential constraint that checks data provided as a foreign key reference an existing primary key value in the parent table. This constraint must be validated by the DBMS before accepting any kind of modification on the table. So, we can say that a foreign key ensures data integrity.
If we carry on with our example, let’s say we have table called StudentEnrollments which keeps track of the enrollments a student has made for current year and his final mark.
StudentId | CourseId | FinalMark |
5345664 | 7897 | 16 |
8795165 | 8975 | 9 |
5345664 | 7897 | 18 |
Here we have two foreign keys:
- StudentID column refers to the primary key in the Students table presented previously ;
- CourseId column refers to the primary key in the StudentEnrollments table also presented previously.
As a final question for readers, what could possibly be the primary key we can use for this particular table?
Answer: the combination of both foreign keys, i.e. (StudentId,CourseId).
Note
Usually, foreign key references the primary key in a different table. It’s what have been shown in the examples. It’s also good to use it inside the same table. Such a design can be used to create hierarchical ordering in a table. You will find below an example of this: a table called department which has a unique identifier and can be member of another department.
DeptId | Name | HeadOfDept_id | ParentDeptId |
10 | Finance | 2 | (null) |
20 | Accounting | 23 | 10 |
Here the Finance department has no parent department but has a child department which is the Accounting.
Disabling a foreign key constraint or the way to untrust a foreign key
SQL Server allows you to temporarily disable any CHECK or FOREIGN KEY constraint. Common use cases for this particular features are in the area of copying or importing data to a table faster. The process will work for sure, but one who does this operation must ensure that data he’s copying won’t violate those constraints before activate it again.
Here is the syntax to be used to disable then enable a constraint:
1 2 3 4 5 6 7 |
ALTER TABLE <SchemaName>.<TableName> NOCHECK CONSTRAINT <ConstraintName>; ALTER TABLE <SchemaName>.<TableName> WITH { CHECK | NOCHECK } CHECK CONSTRAINT <ConstraintName >; |
Specifying WITH CHECK in a statement tells to SQL Server the user wants it to validate the constraint against every single row in the table, then, if successful, enable it.
In contrast, specifying WITH NOCHECK, which is the default for an existing constraint, means that the constraint is enabled but no validation has been made on it. Even if this mode is faster to run, it can lead to severe side effects on performance: SQL Server doesn’t trust the constraint as it has not validated it. We refer to such a foreign key as an « untrusted foreign key ». As a consequence, the query optimizer won’t use the constraint to do his job…
Human error can occur: one can forget to re-enable the constraint.
The proof of this will be given in the demo section.
Detect untrusted foreign keys and take the appropriate action
There is a simple way to detect whether a database contains one or more untrusted foreign keys. It’s simply by querying the sys.foreign_keys view in that database and check the is_not_trusted column. If this column is set to 1, it means the constraint is untrusted.
Here is a possible version for the query:
1 2 3 4 5 6 7 8 9 |
SELECT '[' + s.name + '].[' + o.name + '].[' + i.name + ']' AS keyname FROM sys.foreign_keys i INNER JOIN sys.objects o ON i.parent_object_id = o.object_id INNER JOIN sys.schemas s ON o.schema_id = s.schema_id WHERE i.is_not_trusted = 1 AND i.is_not_for_replication = 0 AND i.is_disabled = 0 |
This method is simple but not practical because you must do it against every single database on a SQL Server instance. Instead, we developed a stored procedure that will return a dataset with all the untrusted foreign keys for every database plus the code to solve the issue.
The procedure is called [Administration].[GetUntrustedForeignKeys]. It creates a global temporary table you can then reuse in any other procedure. This table will contain the information described above.
You will find below its interface. You can specify a database name. If you don’t, it will run against every accessible database on the instance. There are also some parameters that influence the behavior of the procedure.
1 2 3 4 5 6 7 8 |
PROCEDURE [Administration].[GetUntrustedForeignKeys] ( @DbName VARCHAR(256) = NULL, @_NoTempTableDropBefore BIT = 0, @_NoTempTableDropAtTheEnd BIT = 0, @_NoSelectOnTempTable BIT = 0 ) |
The table that is sent back is of the current form:
1 2 3 4 5 6 7 |
DbName VARCHAR(256), SchemaName VARCHAR(256), TableName VARCHAR(256), ForeignKeyName VARCHAR(256), DDL2Resolve VARCHAR(MAX) – computed column |
In addition, we defined another stored procedure built on the previous one that effectively runs the code defined in the DDL2Resolve column.
It’s called [Administration].[RunCheckUntrustedForeignKeys].
Foreign key and performance (Demo)
Let’s begin this section by creating a database for testing and some tables with primary keys. The tables will be the ones shown in the examples: Students, StudentEnrollments and Courses.
Database Creation Statements.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
CREATE DATABASE [TestFK] CONTAINMENT = NONE ON PRIMARY ( NAME = N'TestFK', FILENAME = N'E:\DBASQL2K12T\TestFK.mdf' , SIZE = 131072KB , MAXSIZE = 16777216KB , FILEGROWTH = 262144KB ) LOG ON ( NAME = N'TestFK_log', FILENAME = N'L:\DBASQL2K12T\TestFK_log.ldf' , SIZE = 262144KB , MAXSIZE = 16777216KB , FILEGROWTH = 262144KB ); GO USE [TestFK] GO IF NOT EXISTS ( SELECT name FROM sys.filegroups WHERE is_default=1 AND name = N'PRIMARY' ) BEGIN ALTER DATABASE [TestFK] MODIFY FILEGROUP [PRIMARY] DEFAULT; END; GO |
Creation of the Students table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
use TestFK; GO IF(OBJECT_ID('[dbo].[Students]') IS NULL) BEGIN CREATE TABLE [dbo].[Students] ( StudentID INT IDENTITY(1,1) PRIMARY KEY, FirstName VARCHAR(128) NOT NULL, LastName VARCHAR(128) NOT NULL, Sex VARCHAR(6) NOT NULL ); END; |
Creation of the Courses table.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
use TestFK; GO IF(OBJECT_ID('[dbo].[Courses]') IS NULL) BEGIN CREATE TABLE [dbo].[Courses] ( CourseID INT IDENTITY(1,1) PRIMARY KEY, Title VARCHAR(128) NOT NULL, CourseDescription VARCHAR(MAX) ); END; |
Creation of the StudentEnrollments table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
use TestFK; GO IF(OBJECT_ID('[dbo].[StudentEnrollments]') IS NULL) BEGIN CREATE TABLE [dbo].[StudentEnrollments] ( StudentID INT NOT NULL, CourseID INT NOT NULL, FinalMark INT NULL ); END; IF NOT EXISTS ( SELECT 1 FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[StudentEnrollments]') AND name = N'PK_StudentEnrollments' ) BEGIN ALTER TABLE [dbo].[StudentEnrollments] ADD CONSTRAINT [PK_StudentEnrollments] PRIMARY KEY CLUSTERED ( [StudentID] ASC, [CourseID] ASC ); END; GO |
As you may have noticed, we haven’t created the foreign key constraints yet. Let’s first get an overview of the way SQL Server handles a query without this foreign key.
You will find below a query that uses the StudentEnrollments table with the other two tables to get back a view of the name and firstname of a student following a course.
1 2 3 4 5 6 7 8 9 10 |
set statistics io on set statistics time on select s.FirstName,s.LastName from [dbo].[StudentEnrollments] se inner join [dbo].[Students] s on se.StudentID = s.StudentID inner join [dbo].[Courses] c on se.CourseID = c.CourseID ; |
Here is its actual execution plan.
Here are statistics on I/O and time:
Let’s now create the foreign key references and see if there is a difference.
Create the Foreign Key constraint to the Courses table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
USE [TestFK] GO IF NOT EXISTS ( SELECT 1 FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_StudentEnrollments_Courses]') AND parent_object_id = OBJECT_ID(N'[dbo].[StudentEnrollments]') ) BEGIN ALTER TABLE [dbo].[StudentEnrollments] ADD CONSTRAINT [FK_StudentEnrollments_Courses] FOREIGN KEY([CourseID]) REFERENCES [dbo].[Courses] ([CourseID]) ; END; GO |
Create the Foreign Key constraint to the Students table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
USE [TestFK] GO IF NOT EXISTS ( SELECT 1 FROM sys.foreign_keys WHERE object_id = OBJECT_ID(N'[dbo].[FK_StudentEnrollments_Students]') AND parent_object_id = OBJECT_ID(N'[dbo].[StudentEnrollments]') ) BEGIN ALTER TABLE [dbo].[StudentEnrollments] ADD CONSTRAINT [FK_StudentEnrollments_Students] FOREIGN KEY([StudentID]) REFERENCES [dbo].[Students] ([StudentID]) ; END; GO |
The newer foreign keys change the execution plan of the query we used for test as you will see below. But first, the previous query:
1 2 3 4 5 6 7 8 9 10 |
set statistics io on set statistics time on select s.FirstName,s.LastName from [dbo].[StudentEnrollments] se inner join [dbo].[Students] s on se.StudentID = s.StudentID inner join [dbo].[Courses] c on se.CourseID = c.CourseID ; |
And the resulting actual plan:
There is no access to the [dbo].[Courses] table introduced by the inner join due to foreign key addition. As a conclusion, foreign keys may reduce I/O, CPU and time for SELECT queries.
Now, we will deactivate the foreign keys and insert data into tables then reactivate the foreign key. We will see whether this implies a different behavior or not and check that the foreign keys are untrusted.
1 2 3 4 5 6 |
Alter table [dbo].[StudentEnrollments] nocheck constraint [FK_StudentEnrollments_Students]; Alter table [dbo].[StudentEnrollments] nocheck constraint [FK_StudentEnrollments_Courses]; |
Are those foreign keys untrusted (using the query shown in Section 4).
The query:
1 2 3 4 5 6 7 8 9 |
SELECT '[' + s.name + '].[' + o.name + '].[' + i.name + ']' AS keyname FROM sys.foreign_keys i INNER JOIN sys.objects o ON i.parent_object_id = o.object_id INNER JOIN sys.schemas s ON o.schema_id = s.schema_id WHERE i.is_not_trusted = 1 AND i.is_not_for_replication = 0 AND i.is_disabled = 0 |
Its result:
Conclusion: No, the foreign keys are not untrusted at the moment.
Let’s populate the tables.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 |
SET NOCOUNT ON ; DECLARE @RowCount INT; DECLARE @CurrentSex VARCHAR(6); DECLARE @RowIdStudent INT; SET @RowCount = 0; BEGIN TRANSACTION; WHILE @RowCount < 21 BEGIN if(@RowCount %2 = 0) BEGIN SET @CurrentSex = 'Male'; END; ELSE BEGIN SET @CurrentSex = 'Female'; END; insert into [dbo].[Students](FirstName,LastName,Sex) values ( REPLICATE('a',@RowCount%6+1), REPLICATE('b',@RowCount%6+1), @CurrentSex ); If(@RowCount % 10 = 0) BEGIN Insert into [dbo].[Courses] ( Title,CourseDescription ) values( 'Course : ' + CONVERT(VARCHAR(20),@RowCount), 'Course from ' + CONVERT(VARCHAR(20),@RowCount) ); END; SET @RowCount = @RowCount + 1; END; COMMIT; insert into StudentEnrollments ( StudentID,CourseID ) select StudentId, 1 from Students where StudentID % 7 = 0 union all select StudentId, 2 from Students where StudentID % 7 = 1; |
Are the foreign keys untrusted now?
No, not yet as shown by the result of our simple detection query:
So, let’s re-enable foreign keys.
1 2 3 4 5 6 |
Alter table [dbo].[StudentEnrollments] check constraint [FK_StudentEnrollments_Students]; Alter table [dbo].[StudentEnrollments] check constraint [FK_StudentEnrollments_Courses]; |
Are the foreign keys untrusted now?
Definitely, yes as shown by the result of our simple detection query:
Did it change something in the query plan for our test query? Let’s run it again and check its actual execution plan and statistics.
The actual query plan is the following one:
While the statistics are:
Now let’s run our procedure [Administration].[GetUntrustedForeignKeys].
Here is the T-SQL code to run the procedure for the test database alone:
1 2 3 4 |
exec SAIDBA.[Administration].[GetUntrustedForeignKeys] @DbName = 'TestFK'; |
Here is the result given by the procedure:
The DDL2Resolve value for the first untrusted foreign is:
1 2 3 4 5 6 |
USE [TestFK]; PRINT 'TestFK.dbo.StudentEnrollments(FK_StudentEnrollments_Courses)'; ALTER TABLE [dbo].[StudentEnrollments] WITH CHECK CHECK CONSTRAINT [FK_StudentEnrollments_Courses]; |
Let’s run both commands as the number of untrusted foreign keys is limited.
Did we finally solve the problem? Let’s check that there is no untrusted foreign key anymore.
The simple audit query:
1 2 3 4 5 6 7 8 9 |
SELECT '[' + s.name + '].[' + o.name + '].[' + i.name + ']' AS keyname FROM sys.foreign_keys i INNER JOIN sys.objects o ON i.parent_object_id = o.object_id INNER JOIN sys.schemas s ON o.schema_id = s.schema_id WHERE i.is_not_trusted = 1 AND i.is_not_for_replication = 0 AND i.is_disabled = 0 |
Its result:
The test query:
1 2 3 4 5 6 7 8 9 10 |
set statistics io on set statistics time on select s.FirstName,s.LastName from [dbo].[StudentEnrollments] se inner join [dbo].[Students] s on se.StudentID = s.StudentID inner join [dbo].[Courses] c on se.CourseID = c.CourseID ; |
Its actual plan:
Its execution statistics:
As the demo comes to its end, there is a final action to do: cleanup.
Database Cleanup Statements.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
Use [TestFK]; GO /* TRUNCATE TABLE Courses; TRUNCATE TABLE Students; TRUNCATE TABLE StudentEnrollments; */ IF(OBJECT_ID('[dbo].[StudentEnrollments]') IS NOT NULL) BEGIN EXEC sp_executesql N'DROP TABLE [dbo].[StudentEnrollments]'; END; GO IF(OBJECT_ID('[dbo].[Students]') IS NOT NULL) BEGIN EXEC sp_executesql N'DROP TABLE [dbo].[Students]'; END; GO IF(OBJECT_ID('[dbo].[Courses]') IS NOT NULL) BEGIN EXEC sp_executesql N'DROP TABLE [dbo].[Courses]'; END; GO USE [master]; GO IF(DB_ID('TestFK') IS NOT NULL) BEGIN EXEC sp_executesql N'DROP DATABASE [TestFK]'; END; GO |
- How to perform a performance test against a SQL Server instance - September 14, 2018
- Concurrency problems – theory and experimentation in SQL Server - July 24, 2018
- How to link two SQL Server instances with Kerberos - July 5, 2018