There are many new features in SQL Server 2016, but the one we will focus on in this post is:
- Temporal Database Tables
One of the many new features in SQL server 2016 is the built-in support for temporal data tables. This is a feature that many developers have been urging to have as a “out of the box feature” – and therefore many have developed their own ways of implementing the capability to store information about how data looked in a certain period of time.
What is a Temporal Data Table?
The concept of a temporal data table is that the table will hold information about a records value in any point in time – thus meaning that we easily can read from the table and get values that are not current, but merely a historical view of the record.
The table that is to hold temporal data, needs to contain system columns of datatype Datetime2, one holding the startdate and one holding the enddate, this will allow any application or a user to query the data and get values for a specific point in time.
Microsoft has decided to call temporal data tables “System-Versioned” tables when implementing them in SQL Server 2016.
Who needs Temporal Data Tables?
There is no such thing as data that are not dynamic, if data is not dynamic, the system/application is dead and we have other issues. All other applications or business users consuming data or creating data will know that they eventually will need to look at the historical data, some use cases for the temporal data tables would include:
- Understanding how the business changes over time
- Tracking data changes over time
- Auditing all changes to data
- Maintaining a slowly changing dimension for decision support applications
- Recovering from accidental data changes and application errors
- Only allowing part of the organization to look at updated records, whilst another look at the old version.
Business Intelligence
From a Business Intelligence developer viewpoint, this could solve some issues on collection historical data in i.e. Dimension Tables where many business analysts have a valid need to know how data looked at a point-in-time. This is especially an issue in regards to i.e. Organizational Dimensions and their hierarchy, which is bound to change over time. With the implementation of temporal data tables we get the feature out-of-the box, all we need to do is to configure the table.
Human Resource
Human resource data is by default Temporal, as an employee tends not to stay in the exact same state from day one, until either he/she is terminated or leaving for another job position. This is another clear use case for this feature.
How does it Work?
The way Microsoft have chosen to implement this design in SQL server 2016, is to have a new table nested under the table that we enable for system-versioning. So actually when you look at the table in SSMS, there will be a little icon on the left indicating that there is more to this table, as you can see in the figure 1.
Figure 1: Table indicating that system-versioning is enabled
Once we expand the table information, we discover that there is a new table under the hood, this table is named history, but can be named whatever you prefer. The history table will hold information on attributes from the main table, as well as a start and end date for the period, where the value in the attribute is valid. This is shown in figure 2.
Figure 2:The table information is now expanded
This means that whenever there is a change to any attributes in the main table, a copy of the old values paired with a start and an end date will be inserted into the history table. Thus giving you or the application the ability to look at data from the past.
Insert
When inserting new data in a table that is system-versioned, rows will be given the default enddate “9999-12-31 23:59:59” upon insertion, this will allow the server to maintain the data and always have the current dataset in the table at all times. Basically means that the record is open and valid.
Updates
When updating a record on a system-versioned table, the startdate will be updated to the UTC time from the transaction running the update statement. The enddate will be “9999-12-31 23:59:59” and the old attribute values will be inserted into the history table, with a enddate set to UTC time from the transaction running the update statement and the startdate will remain.
Deletes
When deleting a row in a system-versioned table, the current values will be stored in the history table, and the end date will be set to UTC time from the transaction running the delete statement. In the main table the row will be removed, and thus not show up in the datasets from that table. If we would like to see the data, we need to query the history table.
How to enable Temporal (system-versioned) on a table
How to get a table to be system-versioned is something that we have to script at the moment, and it’s a fairly easy task to script. The table itself need to have a start and end date defined before we can enable the feature on the table.
We will create the Product table that will have system-version enabled from the start:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
-- To create a new Temporal Data Table (System-Versioned) -- we need to issue the following statement. -- We asumme that the table is not created and does not exists. CREATE TABLE dbo.Product ( Product_id INT NOT NULL IDENTITY(1,1) CONSTRAINT PK_Product PRIMARY KEY NONCLUSTERED, Category_id INT NULL, Product_Name VARCHAR(255) NOT NULL, startdate DATETIME2(0) GENERATED ALWAYS AS ROW START NOT NULL, enddate DATETIME2(0) GENERATED ALWAYS AS ROW END NOT NULL, PERIOD FOR SYSTEM_TIME (startdate, enddate) ) WITH ( SYSTEM_VERSIONING = ON ( HISTORY_TABLE = dbo.ProductHistory ) ); |
Besides the ordinary table creation statement, we have added a few new lines to the create statement, the lines are these, and they will be explained.
1 |
startdate DATETIME2(0) GENERATED ALWAYS AS ROW START NOT NULL |
This line declares that startdate attribute, should be of type datetime2 and always generated when data is either inserted, updated or deleted. Also we indicate that it should be of the type ROW START.
1 |
enddate DATETIME2(0) GENERATED ALWAYS AS ROW END NOT NULL |
This line declares that enddate attribute, should be of type datetime2 and always generated when data is either inserted, updated or deleted. Also we indicate that it should be of the type ROW END.
1 |
PERIOD FOR SYSTEM_TIME (startdate, enddate) |
Now this last statement is the PERIOD that tells us which attributes are to be handled as a start and an end of the period.
The table have the startdate (startdate) and enddate (enddate) defined, and upon creation the historytable called [dbo].[Producthistory] is created.
Now the table is ready for the data let us insert some test data into it:
1 2 3 4 5 6 7 8 9 |
--Insert data into dbo.Product INSERT INTO dbo.Product (Category_id, Product_Name) VALUES (1, 'Toys') ,(2, 'Bikes') ,(3, 'Clothes') ,(4, 'Cars') |
Now if we select the data from product, we will have the following dataset at hand:
Product_id | Category_id | Product_Name | startdate | enddate |
1 | 1 | Toys | 2015-06-27 14:21:24 | 9999-12-31 23:59:59 |
2 | 2 | Bikes | 2015-06-27 14:21:24 | 9999-12-31 23:59:59 |
3 | 3 | Clothes | 2015-06-27 14:21:24 | 9999-12-31 23:59:59 |
4 | 4 | Cars | 2015-06-27 14:21:24 | 9999-12-31 23:59:59 |
As you can see all records is the current ones, indicated by the enddate set to “9999-12-31 23:59:59”
If we query the table holding the history, it will give us an empty dataset, because none of the original records has been updated or deleted.
Now, let us try to update a record, and see what happens to our data:
1 2 3 4 5 |
--Update a row in dbo.product UPDATE dbo.Product SET Product_Name = 'Christmas Accesories' WHERE Product_id = 4 |
When we again select from our table dbo.Product, we can see that the record in fact have changed. But if we look closely at the record we updated, you can see that the startdate has changed from the initial “2015-06-27 14:21:24” to “2015-06-27 14:31:32” – this means that the value for this exact record is valid in a period of 10 minutes:
Product_id | Category_id | Product_Name | startdate | enddate |
1 | 1 | Toys | 2015-06-27 14:21:24 | 9999-12-31 23:59:59 |
2 | 2 | Bikes | 2015-06-27 14:21:24 | 9999-12-31 23:59:59 |
3 | 3 | Clothes | 2015-06-27 14:21:24 | 9999-12-31 23:59:59 |
4 | 4 | Christmas Accesories | 2015-06-27 14:31:32 | 9999-12-31 23:59:59 |
If we look at the dbo.ProductHistory table, we can now see that a new record has been inserted, and here we will see that a value for a record with an id = 4 will be “Cars” in a period of 10 minutes:
Product_id | Category_id | Product_Name | startdate | enddate |
4 | 4 | Cars | 2015-06-27 14:21:24 | 2015-06-27 14:31:32 |
This means that we can query our history tables if we need to know an exact value for a specific point in time.
- Using SQL Server 2016 CTP3 in Azure - November 6, 2015
- New Features in SQL Server 2016 – Dynamic Data Masking - July 23, 2015
- New Features in SQL Server 2016 – Always encrypted - July 8, 2015