Introduction
There is a common need in reporting to aggregate or return data that is crunched based on date attributes. These may include weekdays, holidays, quarters, or time of year. While any of this information can be calculated on the fly, a calendar table can save time, improve performance, and increase the consistency of data returned by our important reporting processes. In my previous article, you could learn about designing of a calendar table.
Implementing a Calendar Table
When the design process for a calendar table is complete, we can begin creating our date-related data. In this article, we will run through the remaining TSQL needed to create rows in Dim_Date, add holiday metrics, and demo a few uses of the data. As has been the theme thus far, creativity, flexibility, and customization are key to making this a success. Only use data elements you need, and feel free to add more as needed.
At this point, let’s create a row in Dim_Date for the current date being processed:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 |
INSERT INTO dbo.Dim_Date (Calendar_Date, Calendar_Date_String, Calendar_Month, Calendar_Day, Calendar_Year, Calendar_Quarter, Day_Name, Day_of_Week, Day_of_Week_in_Month, Day_of_Week_in_Year, Day_of_Week_in_Quarter, Day_of_Quarter, Day_of_Year, Week_of_Month, Week_of_Quarter, Week_of_Year, Month_Name, First_Date_of_Week, Last_Date_of_Week, First_Date_of_Month, Last_Date_of_Month, First_Date_of_Quarter, Last_Date_of_Quarter, First_Date_of_Year, Last_Date_of_Year, Is_Holiday, Is_Holiday_Season, Holiday_Name, Holiday_Season_Name, Is_Weekday, Is_Business_Day, Previous_Business_Day, Next_Business_Day, Is_Leap_Year, Days_in_Month) SELECT @Date_Counter AS Calendar_Date, @Calendar_Date_String AS Calendar_Date_String, @Calendar_Month AS Calendar_Month, @Calendar_Day AS Calendar_Day, @Calendar_Year AS Calendar_Year, @Calendar_Quarter AS Calendar_Quarter, @Day_Name AS Day_Name, @Day_of_Week AS Day_of_Week, @Day_of_Week_in_Month AS Day_of_Week_in_Month, @Day_of_Week_in_Year AS Day_of_Week_in_Year, @Day_of_Week_in_Quarter AS Day_of_Week_in_Quarter, @Day_of_Quarter AS Day_of_Quarter, @Day_of_Year AS Day_of_Year, @Week_of_Month AS Week_of_Month, @Week_of_Quarter AS Week_of_Quarter, @Week_of_Year AS Week_of_Year, @Month_Name AS Month_Name, @First_Date_of_Week AS First_Date_of_Week, @Last_Date_of_Week AS Last_Date_of_Week, @First_Date_of_Month AS First_Date_of_Month, @Last_Date_of_Month AS Last_Date_of_Month, @First_Date_of_Quarter AS First_Date_of_Quarter, @Last_Date_of_Quarter AS Last_Date_of_Quarter, @First_Date_of_Year AS First_Date_of_Year, @Last_Date_of_Year AS Last_Date_of_Year, 0 AS Is_Holiday, 0 AS Is_Holiday_Season, NULL AS Holiday_Name, NULL AS Holiday_Season_Name, @Is_Weekday AS Is_Weekday, @Is_Business_Day AS Is_Business_Day, -- Will be populated with weekends to start. NULL AS Previous_Business_Day, NULL AS Next_Business_Day, @Is_Leap_Year AS Is_Leap_Year, @Days_in_Month AS Days_in_Month SELECT @Date_Counter = DATEADD(DAY, 1, @Date_Counter); END |
Here we insert a row into our calendar table, increment @Date_Counter, and go back to the start of our loop. Some columns are left NULL intentionally, as we will be populating them below. Holidays and business days can be calculated in a set-based fashion once all of our other data has been inserted into the calendar table.
Holiday calculations are completely up to you. Include whatever holidays are relevant, needed for reporting, or helpful in understanding business activity. If holidays are not needed then you may leave out those columns, as well as much of the remaining code in this article. For the examples below, we’ve included wide variety of holidays that are calculated in different ways. For example, President’s day is on the 3rd Monday in February, which can be determined using the month (2), day of week (Monday), and day-of-week-in-month (3). Alternatively, US independence day is simple as it can be determined using only the month (7) and day (4).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 |
-- New Year's Day: 1st of January UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'New Year''s Day', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 1 AND Dim_Date.Calendar_Day = 1 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Martin Luther King, Jr. Day: 3rd Monday in January, beginning in 1983 UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Martin Luther King, Jr. Day', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 1 AND Dim_Date.Day_of_Week = 2 AND Dim_Date.Day_of_Week_in_Month = 3 AND Dim_date.Calendar_Year >= 1983 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- President's Day: 3rd Monday in February UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'President''s Day', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 2 AND Dim_Date.Day_of_Week = 2 AND Dim_Date.Day_of_Week_in_Month = 3 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Valentine's Day: 14th of February UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Valentine''s Day' FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 2 AND Dim_Date.Calendar_Day = 14 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Saint Patrick's Day: 17th of March UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Saint Patrick''s Day' FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 3 AND Dim_Date.Calendar_Day = 17 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Mother's Day: 2nd Sunday in May UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Mother''s Day' FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 5 AND Dim_Date.Day_of_Week = 1 AND Dim_Date.Day_of_Week_in_Month = 2 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Memorial Day: Last Monday in May UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Memorial Day', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 5 AND Dim_Date.Day_of_Week = 2 AND Dim_Date.Day_of_Week_in_Month = (SELECT MAX(Dim_Date_Memorial_Day_Check.Day_of_Week_in_Month) FROM dbo.Dim_Date Dim_Date_Memorial_Day_Check WHERE Dim_Date_Memorial_Day_Check.Calendar_Month = Dim_Date.Calendar_Month AND Dim_Date_Memorial_Day_Check.Day_of_Week = Dim_Date.Day_of_Week AND Dim_Date_Memorial_Day_Check.Calendar_Year = Dim_Date.Calendar_Year) AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Father's Day: 3rd Sunday in June UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Father''s Day' FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 6 AND Dim_Date.Day_of_Week = 1 AND Dim_Date.Day_of_Week_in_Month = 3 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Independence Day (USA): 4th of July UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Independence Day (USA)', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 7 AND Dim_Date.Calendar_Day = 4 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Labor Day: 1st Monday in September UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Labor Day', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 9 AND Dim_Date.Day_of_Week = 2 AND Dim_Date.Day_of_Week_in_Month = 1 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Columbus Day: 2nd Monday in October UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Columbus Day', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 10 AND Dim_Date.Day_of_Week = 2 AND Dim_Date.Day_of_Week_in_Month = 2 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Halloween: 31st of October UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Halloween' FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 10 AND Dim_Date.Calendar_Day = 31 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Veteran's Day: 11th of November UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Veteran''s Day', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 11 AND Dim_Date.Calendar_Day = 11 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Thanksgiving: 4th Thursday in November UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Thanksgiving', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 11 AND Dim_Date.Day_of_Week = 5 AND Dim_Date.Day_of_Week_in_Month = 4 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Election Day (USA): 1st Tuesday after November 1st, only in even-numbered years. Always in the range of November 2-8. UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Election Day (USA)' FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 11 AND Dim_Date.Day_of_Week = 3 AND Dim_Date.Calendar_Day BETWEEN 2 AND 8 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; -- Christmas: 25th of December UPDATE Dim_date SET Is_Holiday = 1, Holiday_Name = 'Christmas', Is_Business_Day = 0 FROM dbo.Dim_date WHERE Dim_Date.Calendar_Month = 12 AND Dim_Date.Calendar_Day = 25 AND Dim_date.Calendar_Date BETWEEN @Start_Date AND @End_Date; |
Note that each holiday definition not only designates and names a holiday, but adjusts Is_Business_Day accordingly. This allows us to determine for each holiday whether it is a business day (do nothing) or not (set it to zero). We also constrain holiday assignments to the date range provided in the stored procedure parameters.
Also note that these are all of the literal holidays as they are defined. Oftentimes, holidays that happen to fall on weekends will have an observed national holiday on Monday. This is not accounted for here, but could easily be adjusted for later on if needed. For example, we could search the table for any instances of Christmas that fall on Saturday or Sunday and immediately change the holiday to be on Monday, and append “Observed” to the holiday name. It’s an extra step that could be applied in a set-based fashion to any group of holidays, incrementing 2 days if Saturday and 1 day if Sunday.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
WITH CTE_Business_Days AS ( SELECT Business_Days.Calendar_Date FROM dbo.Dim_Date Business_Days WHERE Business_Days.Is_Business_Day = 1 ) UPDATE Dim_Date_Current SET Previous_Business_Day = CTE_Business_Days.Calendar_Date FROM dbo.Dim_Date Dim_Date_Current INNER JOIN CTE_Business_Days ON CTE_Business_Days.Calendar_Date = (SELECT MAX(Previous_Business_Day.Calendar_Date) FROM CTE_Business_Days Previous_Business_Day WHERE Previous_Business_Day.Calendar_Date < Dim_Date_Current.Calendar_Date) WHERE Dim_Date_Current.Calendar_Date BETWEEN @Start_Date AND @End_Date; WITH CTE_Business_Days AS ( SELECT Business_Days.Calendar_Date FROM dbo.Dim_Date Business_Days WHERE Business_Days.Is_Business_Day = 1 ) UPDATE Dim_Date_Current SET Next_Business_Day = CTE_Business_Days.Calendar_Date FROM dbo.Dim_Date Dim_Date_Current INNER JOIN CTE_Business_Days ON CTE_Business_Days.Calendar_Date = (SELECT MIN(Next_Business_Day.Calendar_Date) FROM CTE_Business_Days Next_Business_Day WHERE Next_Business_Day.Calendar_Date > Dim_Date_Current.Calendar_Date) WHERE Dim_Date_Current.Calendar_Date BETWEEN @Start_Date AND @End_Date; |
With holidays and business days defined, we can use (somewhat messy) common table expressions to determine the previous and next business days. These calculations can be very useful in understanding how business reacts to the start or end of a streak of business days or non-business days. For example, does work pile up over long weekends, resulting in an influx of business on the next work day?
Holiday seasons are another consideration that may be useful when determining how business reacts to the lead-up to a holiday, a season, or proximity to a holiday. For this example, we’ll populate the Christmas holiday season:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
WITH CTE_Thanksgiving AS ( SELECT Dim_date.Calendar_Date AS Thanksgiving_Date FROM dbo.Dim_date WHERE Dim_date.Holiday_Name = 'Thanksgiving' ) UPDATE Dim_date SET Is_Holiday_Season = 1 FROM dbo.Dim_date INNER JOIN CTE_Thanksgiving ON DATEPART(YEAR, CTE_Thanksgiving.Thanksgiving_Date) = DATEPART(YEAR, Dim_date.Calendar_Date) WHERE (Dim_Date.Calendar_Month = 11 AND Dim_Date.Calendar_Date >= CTE_Thanksgiving.Thanksgiving_Date) OR (Dim_Date.Calendar_Month = 12 AND Dim_Date.Calendar_Day < 25); END GO |
In this TSQL, we define Thanksgiving and use that date to create a sequence of dates between it and Christmas, which becomes our holiday season. Seasons can vary widely based on business needs, encompassing times of year, holidays, weather, or special events. If desired, we could add and populate a day of season column, in order to track how many days into a given season we are (or how many remain).
That’s the end of the proc! Now, let’s test it out:
1 2 3 4 5 |
EXEC dbo.Populate_Dim_Date @Start_Date = '1/1/2015', -- Start of date range to process @End_Date = '1/1/2030'; -- End of date range to process |
This takes about seven seconds to run and generates 5480 rows of date data for our use. The results look like this:
All scripts in this article are attached at the end to allow for easy tinkering. Feel free to download and modify to your heart’s content!
Customization
A calendar table can be customized and adjusted to meet your business needs. If ¾ of these columns are unnecessary, then feel free to remove them. If there are additional metrics that are useful to your applications or reports, then feel free to add them! If a piece of data can be derived from a date, then it might be useful here.
The purpose of a calendar table is to improve the efficiency of reports or processes that require calculations to be made based on date components. In addition, we centralize the calendar data, which ensures consistency across any code that requires it. This is preferable to calculating these metrics on-the-fly every time we need them.
Performance
In our examples, we generated quite a few different columns, but you’re free to use as many or as few as makes sense for your application. This is a scenario in which there is no need to fear a wide table as most metrics will be relatively small (such as BIT, DATE, or TINYINT). Additionally, we have total control over the row count in a calendar table. For example, 75 years of data would result in a relatively lean 365 * 75 + 19 (27,394) rows of data (one row per date).
This is a reporting table that we can add more rows to later at a future time. That being said, if we intentionally only include a few years, we need to remember to add more in the future, or reports may stop functioning correctly. In addition, for forecasting and predictive analytics, we may need to span many years in the future, even if our current data set does not extend that far. Of course, if we know for certain that we will never need data before or after any given date, then by all means include only the data that is needed.
With regards to data types, be sure to choose the smallest possible types for a given column. DATE is preferable to DATETIME, BIT smaller than INT, and consider using TINYINT or SMALLINT instead of INT. For example, the day number within a given year will never be larger than 366 or smaller than 1, therefore a SMALLINT is more than enough for this column.
Using a Calendar Table in Reporting
Using a calendar table optimally requires that we add a date key to reporting tables, index, and join the calendar table on that column. To test this, let’s use Sales.SalesOrderHeader in AdventureWorks:
1 2 3 4 |
ALTER TABLE sales.SalesOrderHeader ADD Order_Date_Key AS CAST(OrderDate AS DATE); CREATE NONCLUSTERED INDEX IX_SalesOrderHeader_Order_Date_Key ON sales.SalesOrderHeader(Order_Date_Key); |
This creates a date key for joining om SalesOrderHeader and indexes it.
1 2 3 4 5 6 7 |
SELECT * FROM sales.SalesOrderHeader INNER JOIN dbo.Dim_Date ON SalesOrderHeader.Order_Date_Key = Dim_Date.Calendar_Date; |
This returns no results (!?):
The calendar data we created spanned 2015 – 2030, but the data in SalesOrderHeader is older. As a result, our INNER JOIN returns no matches and we get nothing back. To resolve this, we’ll add more data to our calendar table:
1 2 3 4 5 |
EXEC dbo.Populate_Dim_Date @Start_Date = '1/1/2000', -- Start of date range to process @End_Date = '12/31/2014'; -- End of date range to process |
This runs in about a second, and when we rerun our SELECT, we get results:
Now, let’s try out a few queries:
1 2 3 4 5 6 7 8 9 10 11 12 |
SELECT SalesOrderHeader.TerritoryID, SUM(SalesOrderHeader.SubTotal) AS Sub_Total, COUNT(*) AS Number_of_Orders FROM sales.SalesOrderHeader INNER JOIN dbo.Dim_Date ON SalesOrderHeader.Order_Date_Key = Dim_Date.Calendar_Date WHERE Dim_Date.Holiday_Name = 'Christmas' GROUP BY SalesOrderHeader.TerritoryID ORDER BY SUM(SalesOrderHeader.SubTotal) DESC; |
This tells us which territories were responsible for the most sales on Christmas, ordered by the most on top and fewest on the bottom:
1 2 3 4 5 6 7 8 9 10 11 12 |
SELECT Dim_Date.Calendar_Month, SUM(SalesOrderHeader.SubTotal) AS Monthly_Order_Total, COUNT(*) AS Monthly_Order_Count FROM sales.SalesOrderHeader INNER JOIN dbo.Dim_Date ON SalesOrderHeader.Order_Date_Key = Dim_Date.Calendar_Date WHERE Dim_Date.Day_Name = 'Sunday' GROUP BY Dim_Date.Calendar_Month ORDER BY Dim_Date.Calendar_Month; |
This query groups orders by month, so we can see which months have the most and fewest orders:
These may seem like simple examples, but many reporting and analytics programs expect simple dimensions for consumption. Having the month number or day name is often required in order to perform any calculations using those columns. If we decide to take things further and crunch more complex uses of date components, then having the dimensions pre-calculated will save immense time and ensure that work is possible, let along efficient.
Conclusion
Calendar tables are extremely useful in any reporting, analytics, or even OLTP use case in which we need to frequently join data on data-related attributes. Not only can they greatly improve performance, but they simplify our code and allow reporting engines to consume that data with ease. As a bonus, we gain maintainability as we can retain a single copy of calendar data in one place. This reduces the likelihood of coding mistakes when operating on date data, especially when the calculations are complex.
One area of calendar tables that was intentionally omitted from this was alternate calendars. There are many uses for the ISO, 4-5-4, and a variety of fiscal calendars that can be joined into our standard calendar data. We’ll tackle this in a future article and build on the work we have completed here!
Other articles in this series:
References and Further Reading
- This provides a complete list of date-related data types and functions, which can be used for generating date dimensions: Date and Time Data Types and Functions (Transact-SQL)
- Some notes on calendar use in SSAS 2016 and later: Create a Date type Dimension
- Details on all of SQL Server’s data types. Consider the smallest data type that fully covers a given use case. This saves space and improves performance! Data Types (Transact-SQL)
- Notes on common table expressions (CTEs), which are used for populating some of the data in these demos. Using Common Table Expressions
You can download all scripts in this article here.
Always check your results and make sure that calendar data is correct. It’s easy to update and replace, so do not hesitate to apply sufficient scrutiny as this data is intended to be used in many places.
- SQL Server Database Metrics - October 2, 2019
- Using SQL Server Database Metrics to Predict Application Problems - September 27, 2019
- SQL Injection: Detection and prevention - August 30, 2019