Working with Temporal Tables in SQL Server 2016 (Part 2)

In my previous post, I introduced the concept of temporal data, and explained at a high level how SQL Server 2016 implements temporal tables. This post dives into the details of exactly how you create and query temporal tables.

Let’s start with an ordinary table, and convert it into a temporal table. So I’ll create the Employee table, and load it up with some data.

Temporal01

Temporal02

To convert this into a temporal table, first I’ll add the two period columns and then I’ll enable temporal and set dbo.EmployeeHistory as the name of the history table.

Temporal03

Note that because we’re converting an existing table, this must be done in two separate ALTER TABLE statements. For a new temporal table, you can create it and enable it with a single CREATE TABLE statement. Also, and because this is an existing table with existing data, it’s necessary to set DEFAULT values that initialize the period columns with beginning-of-time (1900-01-01 00:00:00.0000000) until end-of-time (9999-12-31 23:59:59.9999999) values, which is not be necessary when creating a new table, or altering an existing table with no rows.

Now when we expand to see the Employee table in the Object Explorer, you can see that it is being designated as a System-Versioned table, which is the official name for temporal tables in SQL Server 2016. It also has a special icon showing a clock over the table, and this tells you that you’re looking at a temporal table. And nested directly beneath, is the history table, dbo.EmployeeHistory, which SQL Server has created with an identical schema to match dbo.Employee.

Temporal04

To effectively demonstrate how to query temporal tables, we’ll need to change some data first. You can see the current state of the data in the Employee table just as we inserted it, while the history table is completely empty because we haven’t made any changes yet:

Temporal05

Temporal06

So let’s update and delete a few rows. I’ll update the same row for EmployeeID number 5 three times, with a 5 or 6 second pause in between each one. First I’ll change the employee name to Gabriel, then I’ll change the department name to Support, and then I’ll change the department name once more to Executive. And, I’ll also delete EmployeeID number 8:

Temporal07

Now I’ll query the tables again:

Temporal08

The Employee table shows the current state just as you’d expect, with EmployeeID number 5 showing the result of the latest UPDATE, and EmployeeID number 8 being deleted. But if you examine the history table, you can see the three earlier versions of employee ID number 5, corresponding with each of the three updates we ran against that row. And we also see EmployeeID number 8, which we deleted.

Let’s have a closer look at the period columns. In the main table, you can see that every row except for EmployeeID number five has period column values that span from the beginning of time until the end of time. But EmployeeID number five has a StartDate indicating that this version of the row began on August 14 at about 5:35pm. This tells you that there are earlier versions of this row available in the history table.

Temporal09

And indeed, the earlier versions are visible here in the history table. Specifically, the previous version of this row is the one that ends at the same time that the current version begins, August 14th at 5:35 and 25 seconds, which is when we changed the DepartmentName from Support to Executive. This ending datetime2 value of 25 seconds past the minute matches exactly with the starting datetime2 value in the current row:

Temporal10

The previous version also was also replaced by an even earlier version about five seconds earlier, at 20 seconds past the minute, when we changed the department name from Engineering to Support:

Temporal11

And there’s one earlier version of the row from about 6 seconds earlier, at 14 seconds past the minute, which is when we changed the FirstName from Gail to Gabriel (this is clearly the very first version of the row, as indicated by its start date of 1900-01-01 00:00:00.0000000):

Temporal12

We made our changes only 5 or 6 seconds apart, but to effectively demonstrate temporal, we’ll need to tweak that to simulate longer periods of time between changes. Now caution here, you normally wouldn’t ever touch these columns, and in fact, as long as temporal is enabled, SQL Server won’t permit you to change them. So for demonstration purposes only, I’ll disable temporal, adjust the period columns, and then re-enable temporal. This must be done very carefully so that the start and end dates of each version continue to match up exactly as we’ve seen:

Temporal13

In this code, we first alter the table to disable temporal. Next we update the history table period columns so that the first change, when the FirstName was Gail, actually occurred 35 days ago, and we adjust the EndDate accordingly. The second change, when the FirstName was Gabriel and the DepartmentName was Engineering, let’s say actually occurred 25 days ago. So we adjust the EndDate accordingly, but also the StartDate so that it aligns with the first update from 35 days ago. And the third change, when the FirstName was Gabriel and the DepartmentName was Support occurred just now, so we leave its EndDate alone and only adjust the StartDate so that it aligns with the second update from 25 days ago. We also tweak the deleted row for EmployeeID number 8 so that it appears this row was deleted 25 days ago. Finally, we re-enable temporal on the Employee table.

When we look at the data now, it really does look like this table has been around for a while. I’m running this demo today on August 14th, but it’s showing changes from as far back as 35 days ago – where it appears that the first change was made on July 10th. You can also see that our updates have maintained the seamless connection of start and end dates across multiple versions of the same row:

Temporal14

Now we have a table with a history of changes over the past 35 days. And the beauty of temporal is that we can instantly query this table as it appeared at any time during that period. All you do is issue an ordinary SELECT statement, and include the special syntax FOR SYSTEM_TIME AS OF.

So I’ll run four queries. The first one is an ordinary SELECT that just queries over the current version of the table. But the other three include FOR SYSTEM_TIME AS OF to query the table as it appeared two minutes ago, thirty days ago, and forty days ago:

Temporal15

Let’s inspect the results.

The first query shows current data only, so we see the latest version of EmployeeID number 5, with FirstName Gabriel, and the Department is Executive. We’re also missing EmployeeID number 8, because it’s been deleted:

Temporal16

The second query from two minutes ago shows the same row for Employee ID number 5, but the Department is Support, and this is because we only just recently changed the Department from Support to Executive. But EmployeeID number 8 is still missing, because it was deleted more than just two minutes ago:

Temporal17

The third query is from thirty days ago, so we get an even earlier version of EmployeeID number 5, when the Department was still Engineering. And we also see the deleted row for EmployeeID number 8 miraculously reappear, because it was deleted 25 days ago, and these query results are from 30 days ago:

Temporal18

And finally, the fourth query from 40 days ago shows the very original version of the table before any updates or deletes. We see the original name Gail for EmployeeID number 5, as well as the original row for EmployeeID number 8 that got deleted later on:

Temporal19

As you can see, time travel with temporal tables is pretty awesome. But it doesn’t end here. Stay tuned for a future post that shows how to use the new stretch database feature to transparently migrate part or all of the temporal history table to the cloud on Azure SQL Database.

Until then, happy coding!

Introducing Temporal Tables in SQL Server 2016 (Part 1)

SQL Server 2016 introduces System Version Tables, which is the formal name for the long awaited temporal data feature. In this blog post (part 1) I’ll explain what temporal is all about, and part 2 dives into greater detail on temporal with demos.

Overview

Temporal means, time-related, and in the case of SQL Server, this means that you get point-in-time access to a table, allowing you to query not only the table’s current data, but data as it appeared in the table at any past point in time. So data that you overwrite with one or more update statements, or data that you blow away with a delete statement, is never really lost. It’s always and immediately available simply by telling your otherwise ordinary query to travel back in time when looking at the table.

The mechanism behind this magic is actually rather simple, and completely seamless. SQL Server automatically creates a history table with the same schema as the table enabled for temporal, records every update and delete into that history table along with timestamps for identifying each version of every update or delete. Then, the query engine integrates with the history table and gives you any desired point-in-time access to the temporal table.

Think of all the great uses for this feature.

  • Time travel
    • Being able to query data as it changes over time yields tremendous business value, where temporal tables make it very easy to perform all sorts of trend analysis against your data.
  • Slowly changing dimensions
    • This feature is also very handy when you’re incrementally building large data warehouses with slowly changing dimensions, because the history table always contains data changes that are timestamped.
  • Auditing
    • Temporal tables also give you an inherent auditing solution, when you need to track what data has changed, and when, although it won’t record who made the change.
  • Accidental data loss
    • You know that heart-dropping moment after an update or delete that you really didn’t mean? Well, rather than panic and scramble to find that backup and restore it, you can much more easily recover from your accident by accessing the lost data from the history table.

Using Temporal

It’s pretty easy to get going with temporal, because there are very few pre-requisites. Any table can become a temporal table as long as It has a primary key (which you have in virtually any table), as well as a pair of datetime2 columns, known as the period columns. Given those minimal requirements, you can turn any table into a system versioned table.

SQL Server creates the history table with a schema to match the main table, except that it does not enforce constraints on the history table. This makes sense if you think about it, because multiple versions of the same row, with the same primary key value, will be written to the history table for every change, and so it’s just not possible to enforce the uniqueness of the primary key in the history table.

After creating the history table, SQL Server automatically populates it and preserves the original version of any row affected by an update statement in the main table, as well as retaining any row that gets deleted from the main table. Now certainly, this is nothing that we couldn’t achieve ourselves by writing triggers to do the same thing. However, the real power of a temporal table comes into play at query time. Simply by including the additional syntax FOR SYSTEM_TIME AS OF with a specific point in time in your SELECT statement, SQL Server automatically executes your query against the table – as it appeared at that point in time.

Once enabled for temporal, you continue treating the table pretty much like an ordinary table. In most cases, you can even ALTER the table’s schema, and SQL Server will automatically reflect the schema change in the history table to keep it in sync. However, there are some types of schema changes that won’t be possible unless you first break the temporal connection between the main table and the history table, make the same schema change to both tables in exactly the same manner, and then re-establish the connection between them. Examples of schema changes that require these extra steps include adding an identity column, or a computed column.

Creating a Temporal Table

Here’s an example of a temporal table.

CREATE TABLE Department (
 DepartmentID    int NOT NULL IDENTITY(1,1) PRIMARY KEY,
  DepartmentName  varchar(50) NOT NULL,
  ManagerID       int NULL,
  ValidFrom       datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
  ValidTo         datetime2 GENERATED ALWAYS AS ROW END   NOT NULL,
  PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = DepartmentHistory))

It’s just like any other table, and includes the two period columns ValidFrom and ValidTo, although these columns can be named anything you like; you just need to add GENERATED ALWAYS AS ROW START and ROW END, and then reference the two columns with PERIOD FOR SYSTEM_TIME. That’s it for the table schema; to actually turn on temporal for the table, we add WITH SYSTEM_VERSIONING = ON, and also set the name for the history table that SQL Server should create, and that must include the schema name; although if you leave out the HISTORY_TABLE name, SQL Server will generate one based on the main table’s internal object ID.

And that’s all there is to it. You just continue working with the table as usual, and SQL Server captures data changes to the history table, and also maintains the date and time for each version of every row.

Querying a Temporal Table

And so, as a result, you can query the table as it appeared at any past point in time simply by including the FOR SYSTEM_TIME AS OF clause like you see here in this example, where the Employee table is being queried as it appeared exactly thirty days ago:

DECLARE @ThirtyDaysAgo datetime2 = DATEADD(d, -30, SYSDATETIME())

SELECT *
 FROM Employee
 FOR SYSTEM_TIME AS OF @ThirtyDaysAgo
 ORDER BY EmployeeId

So any rows that have been deleted in the past thirty days, they’ll be returned by this query. Any new rows created in the past thirty days? Those won’t be returned. And any rows older than thirty days that have been modified in the past thirty days are returned as they appeared exactly thirty days ago. And that’s the magic of temporal tables in SQL Server 2016.

If you want to learn more, check out part 2, Working with Temporal Tables in SQL Server 2016.