April 1, 2009

Using Database_Default When Specifying SQL Server Collation

In SQL Server, collation refers to the set of rules that determine how data is compared and sorted.  Are you mindful of your database collation?  You should be.  Even if you don’t need to support multiple collations right now, it’s easier to choose a collation and establish your collation policy early on than to go back and update a bunch of database objects later.  Trust me.  We had to go back and apply consistent collation usage to all of our client databases, which required writing a script that dropped constraints, indexes, and views, disabled triggers, set the collation for the database, updated all character data type columns on all tables, and then recreated and re-enabled everything that was dropped or disabled.  Not fun.

Collation can be specified:

  • at the server level when installing or updating SQL Server;
  • when creating or altering a database;
  • when creating or altering table columns; or
  • when casting the collation of an expression. 

However, it’s often easiest to just specify the collation at the database level, and then have everything else in the database inherit that collation.  That way you have consistent collation usage throughout your database.

To do so, database objects (including tables, functions, procedures, views, and triggers) that have character data type columns (meaning char, nchar, varchar, nvarchar, text, and ntext columns) should specify a collation of DATABASE_DEFAULT. The DATABASE_DEFAULT option indicates that a column should use the default collation of the current user database. 

For example, create a database and specify its collation:

CREATE DATABASE [TestDatabase] COLLATE SQL_Latin1_General_CP1_CI_AI 

And then you can use the DATABASE_DEFAULT keyword when creating a character data type column to have that column inherit the database’s collation: 

CREATE TABLE [dbo].[TestTable]
(
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[Col1] [char](10) COLLATE DATABASE_DEFAULT NOT NULL,
[Col2] [ntext] COLLATE DATABASE_DEFAULT NULL
)

Omitting collation declarations is not terribly significant for table columns, as these columns will use the default collation of the database.  Omitting collation declarations for temp tables and table variables, however, is significant, as the columns for these tables will then use the default collation of tempdb, which uses the default collation of the server – and the server might have a different default collation than the current database has.

Handling collation in a consistent manner throughout your database is important.  At work we even took the extra step of writing a Subversion hook that verifies every file being committed that has a table, stored procedure, view, etc. includes the “COLLATE DATABASE_DEFAULT” statement for all character data type columns.  While you may not find this extra step necessary, at the very least be sure to specify the collation wherever necessary.

1 comments:

  1. This is such a PITA to maintain! Why can it be as simple as setting each user's default collation for a database (tempdb included)?!
    ReplyDelete