In this article, we will review the Collate SQL command. First, let us see what collation in the SQL Server is.
Collation is a set of rules that tell database engine how to compare and sort the character data in SQL Server.
Collation can be set at different levels in SQL Server. Below are the three levels:
- SQL Server level
- Database level
- Column level
SQL Server level
Collation setting at SQL Server level can be specified while installing SQL Server. Please refer to the below image:
The collation of the SQL Server is set to SQL_Latin1_General_CP1_CI_AS.
Break down of the collation setting is as below:
- SQL – All SQL Server collations will have the prefix SQL
- Latin1_General – represents the sort rule, CI means case insensitive and AS means accent sensitive
Execute the following T-SQL script to know the collation of the SQL Server instance:
1 |
SELECT SERVERPROPERTY('collation') |
You can also use the stored procedure sp_helpsort to know the collation of the SQL server instance.
Database level
By default, all system and user databases will inherit the collation setting that was specified at the SQL Server level during installation. The below image shows the default collation inherited from the server level when collation is not specified explicitly while creating a database. Here the collation of SQL Server instance is SQL_Latin1_General_CP1_CI_AS which is the same as the database:
1 2 |
CREATE DATABASE [Database_DefaultCollation] GO |
We can also specify the collation of a database while creating the database using the Collate SQL command. Use the below T-SQL script which creates the database with collation SQL_Latin1_General_CP1_CS_AS:
1 2 3 |
CREATE DATABASE [Database_WithCollation] COLLATE SQL_Latin1_General_CP1_CS_AS GO |
Please refer to the below image that shows the database is created with the collation specified using the Collate SQL command:
You can also modify the database collation after creating the database. In this case, use the below T-SQL scripts which modifies the collation setting of a database:
1 2 3 4 |
USE [master] GO ALTER DATABASE [Database_WithCollation] COLLATE SQL_Latin1_General_CP1_CI_AS GO |
Execute the following T-SQL script to know the collation of the database. In this case, the “Sample” is the name of the database. Replace it with yours:
1 |
SELECT DATABASEPROPERTYEX('Sample','collation'); |
Column level
By default, a new character type column inherits the collation of the database unless you specify the collation explicitly while creating the table.
To create a column with a different collation you can specify collation using Collate SQL command. Please refer to the below T-SQL script which creates a column with the specified collation:
1 |
CREATE TABLE Products (ProductID int, ProductName varchar(50) COLLATE SQL_Latin1_General_CP437_BIN) |
Collation setting can be specified on the below character column types:
- varchar
- nvarchar
- char
- nchar
- text
- ntext
To view the collation of a column on the table, execute the following stored procedure with the table name as a parameter:
1 |
sp_help Product |
Examples
Let us create a table and insert sample data for the demo purpose:
1 2 3 4 5 |
create table Product (ProductID int, ProductName varchar(50)) GO INSERT INTO Product values (1,'Book'), (2,'book'), (3,'ᴃook') GO |
Now let us query the Product table. Please refer to the below image it returns two rows and ignores case as the collation is case insensitive. i.e. SQL_Latin1_General_CP1_CI_AS:
Let’s explicitly specify the collation using the Collate SQL command in the Select statement:
1 |
SELECT * FROM Product WHERE ProductName COLLATE SQL_Latin1_General_CP1_CS_AS ='Book' |
Check out the below image that shows only one row in the result set as we have specified the collation in the select statement which is case sensitive:
Let’s execute the following query that returns all the three records as the collation specified is case insensitive and accent insensitive:
1 |
SELECT * FROM Product WHERE ProductName COLLATE SQL_Latin1_General_CP850_CI_AI ='book' |
Joining the two columns of a different collation
For the demo purpose, I am creating two tables Products and ProductDesc with different collation on the column Pcode. Use the below T-SQL script:
1 2 3 4 5 6 7 8 9 10 |
create table Products (Pcode varchar(10), Pname varchar(50) ) INSERT INTO Products VALUES ('BK', 'Book'),('PN','Pen') CREATE TABLE ProductDesc (Pcode varchar(10) collate SQL_Latin1_General_CP850_CI_AI , Pdesc varchar(100)) INSERT INTO ProductDesc values ('BK','TEST'),('PN','TEST2') |
Let’s join these two tables on the column Pcode. It throws an error:
Msg 468, Level 16, State 9, Line 22
Cannot resolve the collation conflict between “SQL_Latin1_General_CP850_CI_AI” and “SQL_Latin1_General_CP1_CI_AS” in the equal to operation.
This is because the collation of the column Pcode in both tables are different:
1 2 |
SELECT * FROM Products P INNER JOIN ProductDesc D ON P.Pcode=D.Pcode |
To make the join work, you must use the Collate SQL command in the Select statement as shown in the below T-SQL script:
1 2 |
SELECT * FROM Products P INNER JOIN ProductDesc D ON P.Pcode=D.Pcode collate SQL_Latin1_General_CP1_CI_AS |
Tempdb collation
The collation of the tempdb will be the same as the collation of the SQL Server instance. In some cases, we may use different collations at the database level and server level. In this case, we may run to issues if we use temporary tables and making joins with the tables in the database.
For example, the collation of SQL Server instance is SQL_Latin1_General_CP1_CI_AS and I am creating a database with different collation, i.e. SQL_Latin1_General_CP437_BIN. Now create a table with a column of type varchar which will inherit the database collation SQL_Latin1_General_CP437_BIN:
1 2 3 4 5 6 7 8 9 10 |
CREATE DATABASE [SampleDB] COLLATE SQL_Latin1_General_CP437_BIN GO Use SampleDB GO create table Product (Pcode varchar(5), Pname varchar(30)) GO INSERT INTO Product values ('BK','Book') |
Now create a temp table with a column of type varchar which will inherit the collation of tempdb database i.e. SQL_Latin1_General_CP1_CI_AS:
1 2 3 4 5 |
create table #Product (Pcode varchar(5), Pname varchar(30)) GO INSERT INTO #Product values ('BK','Book') |
Now joining these tables on column Pcode will throw an error as the collations are different:
In these cases, we must explicitly specify the collation using Collate SQL command while creating a temp table in stored procedures or T-SQL batch.
It is better to use the same collation at the database level and SQL Server level in order to avoid such issues while performing operations on temp tables or cross-database operations when the two databases have a different collation.
Conclusion
In this article, we explored the collation setting in SQL Server at different levels and querying tables using the Collate SQL command. In case you have any questions, feel free to ask in the comment section below.
- Geo Replication on Transparent Data Encryption (TDE) enabled Azure SQL databases - October 24, 2019
- Overview of the Collate SQL command - October 22, 2019
- Recover a lost SA password - September 20, 2019