Managing untrusted foreign keys

Introduction

Intended audience

This article is intended for application developers and database administrators who plan to develop, deploy, and/or assess solutions for Microsoft SQL Server on a Microsoft Windows platform.

Typographical Conventions

Convention	Meaning
Stylized Consolas Font	Used for blocks of code, commands and script examples. Text should be interpreted exactly as presented
Consolas Font	Used for inline code, commands or examples. Text should be interpreted exactly as presented
<italic font in brackets>	Italic texts set in angle brackets denote a variable requiring substitution for a real value
Italic font	Used to denote the title of a book, article, or other publication
Note	Additional information or caveats

Overview

The subject of this article is untrusted foreign keys. Those three words together have a meaning but each word has to be well understood before that. That’s the reason why this article will start with a definitions section where we will define what is a key, what is a foreign key, and how a foreign key can be “untrusted”. Then, we will cover the way to discover and repair untrusted foreign keys for a given database. Finally, we will demonstrate in detail how foreign keys influence the execution of a query when it does not exist, when it exists and is “trusted” and when it exists and is “untrusted”.

Definitions

Before going any further, let’s set the base and answer the following questions: « What is a key? », « What is a primary key? » and finally « What is a foreign key? », which is the topic of this article.

Then, we can talk about what the difference between a trusted and an untrusted foreign key.

What is a key for a relational table?

A key is basically an identifier, something that can uniquely identify a record in a relational table. Keys may consist of a single attribute or multiple attributes in combination. Based on the design of a table, there can be more than just one key which identify a record. We will refer to them as candidate keys.

Example:

Let’s have a look at the following table. It represents a list of students with 4 columns: a random numeric identifier and the firstname, lastname and sex of the student.

StudentID	FirstName	LastName	Sex
5345664	Adam	Kent	Male
8795165	Jefferson	Elias	Male

There are two candidate keys here: the StudentID, and the combination of FirstName and LastName columns.

Let’s add a table with the courses that are given in a particular school. The table will contain a unique identifier for the course, a title and a description.

CourseId	Title	CourseDescription
7897	Networks	An introductive course on network topologies and standards
8975	Numerical Analysis	Basics on numerical analysis.

There are, also, two candidate keys here: the CourseId column alone and the Title column alone.

What is a primary key for a relational table?

The primary key of a relational table is a key that is considered as the most appropriate one among all the acceptable keys for this table. As a key, it can either be a normal attribute that is guaranteed to be unique such as a unique random alpha-numeric identifier or it can be generated by the DBMS.

In our example, the most appropriate key to identify a given student record is the StudentID column.

What is a foreign key in a relational table?

A foreign key is a basically a reference to another table in a DBMS. It comprises all the key columns composing the primary key of that « foreign » or « parent » table. One who defines a foreign key creates at the same time a referential constraint that checks data provided as a foreign key reference an existing primary key value in the parent table. This constraint must be validated by the DBMS before accepting any kind of modification on the table. So, we can say that a foreign key ensures data integrity.

If we carry on with our example, let’s say we have table called StudentEnrollments which keeps track of the enrollments a student has made for current year and his final mark.

StudentId	CourseId	FinalMark
5345664	7897	16
8795165	8975	9
5345664	7897	18

Here we have two foreign keys:

StudentID column refers to the primary key in the Students table presented previously ;
CourseId column refers to the primary key in the StudentEnrollments table also presented previously.

As a final question for readers, what could possibly be the primary key we can use for this particular table?

Answer: the combination of both foreign keys, i.e. (StudentId,CourseId).

Note
Usually, foreign key references the primary key in a different table. It’s what have been shown in the examples. It’s also good to use it inside the same table. Such a design can be used to create hierarchical ordering in a table. You will find below an example of this: a table called department which has a unique identifier and can be member of another department.

DeptId	Name	HeadOfDept_id	ParentDeptId
10	Finance	2	(null)
20	Accounting	23	10

Here the Finance department has no parent department but has a child department which is the Accounting.

Disabling a foreign key constraint or the way to untrust a foreign key

SQL Server allows you to temporarily disable any CHECK or FOREIGN KEY constraint. Common use cases for this particular features are in the area of copying or importing data to a table faster. The process will work for sure, but one who does this operation must ensure that data he’s copying won’t violate those constraints before activate it again.

Here is the syntax to be used to disable then enable a constraint:

ALTER TABLE <SchemaName>.<TableName>

NOCHECK CONSTRAINT <ConstraintName>;

ALTER TABLE <SchemaName>.<TableName>

WITH { CHECK | NOCHECK }

CHECK CONSTRAINT <ConstraintName >;

Specifying WITH CHECK in a statement tells to SQL Server the user wants it to validate the constraint against every single row in the table, then, if successful, enable it.

In contrast, specifying WITH NOCHECK, which is the default for an existing constraint, means that the constraint is enabled but no validation has been made on it. Even if this mode is faster to run, it can lead to severe side effects on performance: SQL Server doesn’t trust the constraint as it has not validated it. We refer to such a foreign key as an « untrusted foreign key ». As a consequence, the query optimizer won’t use the constraint to do his job…

Human error can occur: one can forget to re-enable the constraint.

The proof of this will be given in the demo section.

Detect untrusted foreign keys and take the appropriate action

There is a simple way to detect whether a database contains one or more untrusted foreign keys. It’s simply by querying the sys.foreign_keys view in that database and check the is_not_trusted column. If this column is set to 1, it means the constraint is untrusted.

Here is a possible version for the query:

SELECT '[' + s.name + '].[' + o.name + '].[' + i.name + ']' AS keyname

FROM sys.foreign_keys i

INNER JOIN sys.objects o ON i.parent_object_id = o.object_id

INNER JOIN sys.schemas s ON o.schema_id = s.schema_id

WHERE i.is_not_trusted = 1

AND i.is_not_for_replication = 0

AND i.is_disabled = 0

This method is simple but not practical because you must do it against every single database on a SQL Server instance. Instead, we developed a stored procedure that will return a dataset with all the untrusted foreign keys for every database plus the code to solve the issue.

The procedure is called [Administration].[GetUntrustedForeignKeys]. It creates a global temporary table you can then reuse in any other procedure. This table will contain the information described above.

You will find below its interface. You can specify a database name. If you don’t, it will run against every accessible database on the instance. There are also some parameters that influence the behavior of the procedure.

PROCEDURE [Administration].[GetUntrustedForeignKeys] (

@DbName VARCHAR(256) = NULL,

@_NoTempTableDropBefore BIT = 0,

@_NoTempTableDropAtTheEnd BIT = 0,

@_NoSelectOnTempTable BIT = 0

)

The table that is sent back is of the current form:

DbName VARCHAR(256),

SchemaName VARCHAR(256),

TableName VARCHAR(256),

ForeignKeyName VARCHAR(256),

DDL2Resolve VARCHAR(MAX) – computed column

In addition, we defined another stored procedure built on the previous one that effectively runs the code defined in the DDL2Resolve column.

It’s called [Administration].[RunCheckUntrustedForeignKeys].

Foreign key and performance (Demo)

Let’s begin this section by creating a database for testing and some tables with primary keys. The tables will be the ones shown in the examples: Students, StudentEnrollments and Courses.

Database Creation Statements.

CREATE DATABASE [TestFK]

CONTAINMENT = NONE

ON PRIMARY (

NAME = N'TestFK',

FILENAME = N'E:\DBASQL2K12T\TestFK.mdf' ,

SIZE = 131072KB ,

MAXSIZE = 16777216KB ,

FILEGROWTH = 262144KB

)

LOG ON (

NAME = N'TestFK_log',

FILENAME = N'L:\DBASQL2K12T\TestFK_log.ldf' ,

SIZE = 262144KB ,

MAXSIZE = 16777216KB ,

FILEGROWTH = 262144KB

);

USE [TestFK]

IF NOT EXISTS (

SELECT name

FROM sys.filegroups

WHERE is_default=1 AND name = N'PRIMARY'

)

BEGIN

ALTER DATABASE [TestFK] MODIFY FILEGROUP [PRIMARY] DEFAULT;

END;

Creation of the Students table.

use TestFK;

IF(OBJECT_ID('[dbo].[Students]') IS NULL)

BEGIN

CREATE TABLE [dbo].[Students] (

StudentID INT IDENTITY(1,1) PRIMARY KEY,

FirstName VARCHAR(128) NOT NULL,

LastName VARCHAR(128) NOT NULL,

Sex VARCHAR(6) NOT NULL

);

END;

Creation of the Courses table.

use TestFK;

IF(OBJECT_ID('[dbo].[Courses]') IS NULL)

BEGIN

CREATE TABLE [dbo].[Courses] (

CourseID INT IDENTITY(1,1) PRIMARY KEY,

Title VARCHAR(128) NOT NULL,

CourseDescription VARCHAR(MAX)

);

END;

Creation of the StudentEnrollments table.

use TestFK;

IF(OBJECT_ID('[dbo].[StudentEnrollments]') IS NULL)

BEGIN

CREATE TABLE [dbo].[StudentEnrollments] (

StudentID INT NOT NULL,

CourseID INT NOT NULL,

FinalMark INT NULL

);

END;

IF NOT EXISTS (

SELECT 1

FROM sys.indexes

WHERE object_id = OBJECT_ID(N'[dbo].[StudentEnrollments]')

AND name = N'PK_StudentEnrollments'

)

BEGIN

ALTER TABLE [dbo].[StudentEnrollments]

ADD CONSTRAINT [PK_StudentEnrollments] PRIMARY KEY CLUSTERED (

[StudentID] ASC,

[CourseID] ASC

);

END;

As you may have noticed, we haven’t created the foreign key constraints yet. Let’s first get an overview of the way SQL Server handles a query without this foreign key.

You will find below a query that uses the StudentEnrollments table with the other two tables to get back a view of the name and firstname of a student following a course.

set statistics io on

set statistics time on

select s.FirstName,s.LastName

from [dbo].[StudentEnrollments] se

inner join [dbo].[Students] s

on se.StudentID = s.StudentID

inner join [dbo].[Courses] c

on se.CourseID = c.CourseID ;

Here is its actual execution plan.

Here are statistics on I/O and time:

Let’s now create the foreign key references and see if there is a difference.

Create the Foreign Key constraint to the Courses table.

USE [TestFK]

IF NOT EXISTS (

SELECT 1

FROM sys.foreign_keys

WHERE

object_id = OBJECT_ID(N'[dbo].[FK_StudentEnrollments_Courses]')

AND parent_object_id = OBJECT_ID(N'[dbo].[StudentEnrollments]')

)

BEGIN

ALTER TABLE [dbo].[StudentEnrollments]

ADD CONSTRAINT [FK_StudentEnrollments_Courses]

FOREIGN KEY([CourseID])

REFERENCES [dbo].[Courses] ([CourseID])

;

END;

Create the Foreign Key constraint to the Students table.

USE [TestFK]

IF NOT EXISTS (

SELECT 1

FROM sys.foreign_keys

WHERE

object_id = OBJECT_ID(N'[dbo].[FK_StudentEnrollments_Students]')

AND parent_object_id = OBJECT_ID(N'[dbo].[StudentEnrollments]')

)

BEGIN

ALTER TABLE [dbo].[StudentEnrollments]

ADD CONSTRAINT [FK_StudentEnrollments_Students]

FOREIGN KEY([StudentID])

REFERENCES [dbo].[Students] ([StudentID])

;

END;

The newer foreign keys change the execution plan of the query we used for test as you will see below. But first, the previous query:

set statistics io on

set statistics time on

select s.FirstName,s.LastName

from [dbo].[StudentEnrollments] se

inner join [dbo].[Students] s

on se.StudentID = s.StudentID

inner join [dbo].[Courses] c

on se.CourseID = c.CourseID ;

And the resulting actual plan:

There is no access to the [dbo].[Courses] table introduced by the inner join due to foreign key addition. As a conclusion, foreign keys may reduce I/O, CPU and time for SELECT queries.

Now, we will deactivate the foreign keys and insert data into tables then reactivate the foreign key. We will see whether this implies a different behavior or not and check that the foreign keys are untrusted.

Alter table [dbo].[StudentEnrollments]

nocheck constraint [FK_StudentEnrollments_Students];

Alter table [dbo].[StudentEnrollments]

nocheck constraint [FK_StudentEnrollments_Courses];

Are those foreign keys untrusted (using the query shown in Section 4).

The query:

SELECT '[' + s.name + '].[' + o.name + '].[' + i.name + ']' AS keyname

FROM sys.foreign_keys i

INNER JOIN sys.objects o ON i.parent_object_id = o.object_id

INNER JOIN sys.schemas s ON o.schema_id = s.schema_id

WHERE i.is_not_trusted = 1

AND i.is_not_for_replication = 0

AND i.is_disabled = 0

Its result:

Conclusion: No, the foreign keys are not untrusted at the moment.

Let’s populate the tables.

SET NOCOUNT ON ;

DECLARE @RowCount INT;

DECLARE @CurrentSex VARCHAR(6);

DECLARE @RowIdStudent INT;

SET @RowCount = 0;

BEGIN TRANSACTION;

WHILE @RowCount < 21

BEGIN

if(@RowCount %2 = 0)

BEGIN

SET @CurrentSex = 'Male';

END;

ELSE

BEGIN

SET @CurrentSex = 'Female';

END;

insert into [dbo].[Students](FirstName,LastName,Sex)

values (

REPLICATE('a',@RowCount%6+1),

REPLICATE('b',@RowCount%6+1),

@CurrentSex

);

If(@RowCount % 10 = 0)

BEGIN

Insert into [dbo].[Courses] (

Title,CourseDescription

)

values(

'Course : ' + CONVERT(VARCHAR(20),@RowCount),

'Course from ' + CONVERT(VARCHAR(20),@RowCount)

);

END;

SET @RowCount = @RowCount + 1;

END;

COMMIT;

insert into StudentEnrollments (

StudentID,CourseID

)

select StudentId, 1

from Students where StudentID % 7 = 0

union all

select StudentId, 2

from Students where StudentID % 7 = 1;

Are the foreign keys untrusted now?

No, not yet as shown by the result of our simple detection query:

So, let’s re-enable foreign keys.

Alter table [dbo].[StudentEnrollments]

check constraint [FK_StudentEnrollments_Students];

Alter table [dbo].[StudentEnrollments]

check constraint [FK_StudentEnrollments_Courses];

Are the foreign keys untrusted now?

Definitely, yes as shown by the result of our simple detection query:

Did it change something in the query plan for our test query? Let’s run it again and check its actual execution plan and statistics.

The actual query plan is the following one:

While the statistics are:

Now let’s run our procedure [Administration].[GetUntrustedForeignKeys].

Here is the T-SQL code to run the procedure for the test database alone:

exec SAIDBA.[Administration].[GetUntrustedForeignKeys]

@DbName = 'TestFK';

Here is the result given by the procedure:

The DDL2Resolve value for the first untrusted foreign is:

USE [TestFK];

PRINT 'TestFK.dbo.StudentEnrollments(FK_StudentEnrollments_Courses)';

ALTER TABLE [dbo].[StudentEnrollments]

WITH CHECK CHECK CONSTRAINT [FK_StudentEnrollments_Courses];

Let’s run both commands as the number of untrusted foreign keys is limited.

Did we finally solve the problem? Let’s check that there is no untrusted foreign key anymore.

The simple audit query:

SELECT '[' + s.name + '].[' + o.name + '].[' + i.name + ']' AS keyname

FROM sys.foreign_keys i

INNER JOIN sys.objects o ON i.parent_object_id = o.object_id

INNER JOIN sys.schemas s ON o.schema_id = s.schema_id

WHERE i.is_not_trusted = 1

AND i.is_not_for_replication = 0

AND i.is_disabled = 0

Its result:

The test query:

set statistics io on

set statistics time on

select s.FirstName,s.LastName

from [dbo].[StudentEnrollments] se

inner join [dbo].[Students] s

on se.StudentID = s.StudentID

inner join [dbo].[Courses] c

on se.CourseID = c.CourseID ;

Its actual plan:

Its execution statistics:

As the demo comes to its end, there is a final action to do: cleanup.

Database Cleanup Statements.

Use [TestFK];

TRUNCATE TABLE Courses;

TRUNCATE TABLE Students;

TRUNCATE TABLE StudentEnrollments;

IF(OBJECT_ID('[dbo].[StudentEnrollments]') IS NOT NULL)

BEGIN

EXEC sp_executesql N'DROP TABLE [dbo].[StudentEnrollments]';

END;

IF(OBJECT_ID('[dbo].[Students]') IS NOT NULL)

BEGIN

EXEC sp_executesql N'DROP TABLE [dbo].[Students]';

END;

IF(OBJECT_ID('[dbo].[Courses]') IS NOT NULL)

BEGIN

EXEC sp_executesql N'DROP TABLE [dbo].[Courses]';

END;

USE [master];

IF(DB_ID('TestFK') IS NOT NULL)

BEGIN

EXEC sp_executesql N'DROP DATABASE [TestFK]';

END;

Author
Recent Posts

Jefferson Elias

Living in Belgium, I obtained a master degree in Computer Sciences in 2011 at the University of Liege.

I'm one of the rare guys out there who started to work as a DBA immediately after his graduation. So, I work at the university hospital of Liege since 2011. Initially involved in Oracle Database administration (which are still under my charge), I had the opportunity to learn and manage SQL Server instances in 2013. Since 2013, I've learned a lot about SQL Server in administration and development.

I like the job of DBA because you need to have a general knowledge in every field of IT. That's the reason why I won't stop learning (and share) the products of my learnings.

View all posts by Jefferson Elias

Latest posts by Jefferson Elias (see all)

How to perform a performance test against a SQL Server instance - September 14, 2018
Concurrency problems – theory and experimentation in SQL Server - July 24, 2018
How to link two SQL Server instances with Kerberos - July 5, 2018

Introduction

Definitions

Detect untrusted foreign keys and take the appropriate action

Foreign key and performance (Demo)

Related posts: