Date Dimensions

Hello All –

We have built a date dimension. Just wanted to share our code and results here and see if others have built their date dimensions differently, anything that you would add/remove, etc.?

{{
config(
pre_hook=“ALTER SESSION SET WEEK_OF_YEAR_POLICY=1, WEEK_START=7”
)
}}

WITH DATE_SPINE AS
(
{{ dbt_utils.date_spine(
datepart=“day”,
start_date=“to_date(‘01/01/2018’, ‘mm/dd/yyyy’)”,
end_date=“TO_DATE(CONCAT(YEAR(CURRENT_DATE) + 6,‘-12-31’))”
)
}}
)
SELECT
CAST(DATE_DAY AS DATE) AS DATE_KEY
,CAST(DAYOFYEAR(DATE_KEY) AS INT) AS DAY_OF_YEAR
,CAST(DECODE (RIGHT(EXTRACT(DAYOFMONTH FROM DATE_KEY), 1),
1 , ‘st’,
2 , ‘nd’,
3 , ‘rd’,
‘th’ – Default option, all other days ending aside from 1st, 2nd, and 3rd end in ‘th’
) AS VARCHAR(2)) AS DAY_SUFFIX
,CAST(YEAR(DATE_KEY) || RIGHT(‘0’ || WEEK(DATE_KEY), 2) AS INT) AS WEEK_KEY
,CAST(WEEKOFYEAR(DATE_KEY) AS INT) AS WEEK_OF_YEAR
,CAST(DAYOFWEEK(DATE_KEY) AS INT) AS DAY_OF_WEEK
,CAST(DAYNAME(DATE_KEY) AS VARCHAR(5)) AS WEEK_DAY_SHORT_DESC
,CAST(DECODE (EXTRACT(DAYOFWEEK FROM DATE_KEY),
1 , ‘Sunday’,
2 , ‘Monday’,
3 , ‘Tuesday’,
4 , ‘Wednesday’,
5 , ‘Thursday’,
6 , ‘Friday’,
7 , ‘Saturday’
) AS VARCHAR(9)) AS WEEK_DAY_DESC
,CAST(TRUNC(DATE_KEY, ‘Week’) AS DATE) AS FIRST_DAY_OF_WEEK
,CAST(LAST_DAY(DATE_KEY, ‘Week’) AS DATE) AS LAST_DAY_OF_WEEK
,CAST(YEAR(DATE_KEY) || RIGHT(‘0’ || MONTH(DATE_KEY), 2) AS INT) AS MONTH_KEY
,CAST(MONTH(DATE_KEY) AS INT) AS MONTH_OF_YEAR
,CAST(DAYOFMONTH(DATE_KEY) AS INT) AS DAY_OF_MONTH
,CAST(MONTHNAME(DATE_KEY) AS VARCHAR(5)) AS MONTH_SHORT_DESC
,CAST(TO_CHAR(DATE_KEY, ‘MMMM’) AS VARCHAR(50)) AS MONTH_DESC
,CAST(TRUNC(DATE_KEY, ‘Month’) AS DATE) AS FIRST_DAY_OF_MONTH
,CAST(LAST_DAY(DATE_KEY, ‘Month’) AS DATE) AS LAST_DAY_OF_MONTH
,CAST(YEAR(DATE_KEY) || QUARTER(DATE_KEY) AS INT) AS QUARTER_KEY
,CAST(QUARTER(DATE_KEY) AS INT) AS QUARTER_OF_YEAR
,CAST(DATE_KEY - TRUNC(DATE_KEY, ‘Quarter’) + 1 AS INT) AS DAY_OF_QUARTER
,CAST(‘Q’ || QUARTER_OF_YEAR AS VARCHAR(5)) AS QUARTER_SHORT_DESC
,CAST('Quarter ’ || QUARTER_OF_YEAR AS VARCHAR(50)) AS QUARTER_DESC
,CAST(TRUNC(DATE_KEY, ‘Quarter’) AS DATE) AS FIRST_DAY_OF_QUARTER
,CAST(LAST_DAY(DATE_KEY, ‘Quarter’) AS DATE) AS LAST_DAY_OF_QUARTER
,CAST(YEAR(DATE_KEY) AS INT) AS YEAR_KEY
,CAST(TRUNC(DATE_KEY, ‘Year’) AS DATE) AS FIRST_DAY_OF_YEAR
,CAST(LAST_DAY(DATE_KEY, ‘Year’) AS DATE) AS LAST_DAY_OF_YEAR
,CAST(ROW_NUMBER() OVER (PARTITION BY YEAR(DATE_KEY), MONTH(DATE_KEY), DAYOFWEEK(DATE_KEY) ORDER BY DATE_KEY) AS INT) AS ORDINAL_WEEKDAY_OF_MONTH
,CAST(CASE
WHEN MONTH(DATE_KEY) = 1 AND DAYOFMONTH(DATE_KEY) = 1
THEN ‘New Year’‘s Day’
WHEN MONTH(DATE_KEY) = 1 AND DAYOFMONTH(DATE_KEY) = 20 AND ((YEAR(DATE_KEY) - 1) % 4) = 0
THEN ‘Inauguration Day’
WHEN MONTH(DATE_KEY) = 1 AND DAYOFWEEK(DATE_KEY) = 2 AND ORDINAL_WEEKDAY_OF_MONTH = 3
THEN ‘Martin Luther King Jr Day’
WHEN MONTH(DATE_KEY) = 2 AND DAYOFMONTH(DATE_KEY) = 14
THEN ‘Valentine’‘s Day’
WHEN MONTH(DATE_KEY) = 2 AND DAYOFWEEK(DATE_KEY) = 2 AND ORDINAL_WEEKDAY_OF_MONTH = 3
THEN ‘President’‘s Day’
WHEN MONTH(DATE_KEY) = 3 AND DAYOFMONTH(DATE_KEY) = 17
THEN ‘Saint Patrick’‘s Day’
WHEN MONTH(DATE_KEY) = 5 AND DAYOFWEEK(DATE_KEY) = 1 AND ORDINAL_WEEKDAY_OF_MONTH = 2
THEN ‘Mother’‘s Day’
WHEN MONTH(DATE_KEY) = 5 AND DAYOFWEEK(DATE_KEY) = 2
AND LAST_VALUE(DAYOFMONTH(DATE_KEY)) OVER (PARTITION BY MONTH_KEY
ORDER BY TO_DATE(DATE_KEY)) - 7 <= DAYOFMONTH(DATE_KEY)
THEN ‘Memorial Day’
WHEN MONTH(DATE_KEY) = 6 AND DAYOFWEEK(DATE_KEY) = 1 AND ORDINAL_WEEKDAY_OF_MONTH = 3
THEN ‘Father’‘s Day’
WHEN MONTH(DATE_KEY) = 7 AND DAYOFMONTH(DATE_KEY) = 4
THEN ‘Independence Day’
WHEN MONTH(DATE_KEY) = 9 AND DAYOFWEEK(DATE_KEY) = 2 AND ORDINAL_WEEKDAY_OF_MONTH = 1
THEN ‘Labor Day’
WHEN MONTH(DATE_KEY) = 10 AND DAYOFWEEK(DATE_KEY) = 2 AND ORDINAL_WEEKDAY_OF_MONTH = 2
THEN ‘Columbus Day’
WHEN MONTH(DATE_KEY) = 10 AND DAYOFWEEK(DATE_KEY) = 31
THEN ‘Halloween’
WHEN MONTH(DATE_KEY) = 11 AND DAYOFWEEK(DATE_KEY) = 5 AND ORDINAL_WEEKDAY_OF_MONTH = 4
THEN ‘Thanksgiving Day’
WHEN MONTH(DATE_KEY) = 12 AND DAYOFMONTH(DATE_KEY) = 25
THEN ‘Christmas Day’
WHEN MONTH(DATE_KEY) = 12 AND DAYOFMONTH(DATE_KEY) = 26
THEN ‘Boxing Day’
ELSE NULL
END AS VARCHAR(50)) AS HOLIDAY_DESC
,CAST(CASE WHEN HOLIDAY_DESC IS NULL THEN 0 ELSE 1 END AS BOOLEAN) AS IS_HOLIDAY
FROM DATE_SPINE

{{
config(
post_hook=‘{{ add_not_null_constraints([“DATE_KEY”
,“DAY_OF_YEAR”
,“DAY_SUFFIX”
,“WEEK_KEY”
,“WEEK_OF_YEAR”
,“DAY_OF_WEEK”
,“WEEK_DAY_SHORT_DESC”
,“WEEK_DAY_DESC”
,“FIRST_DAY_OF_WEEK”
,“LAST_DAY_OF_WEEK”
,“MONTH_KEY”
,“MONTH_OF_YEAR”
,“DAY_OF_MONTH”
,“MONTH_SHORT_DESC”
,“MONTH_DESC”
,“FIRST_DAY_OF_MONTH”
,“LAST_DAY_OF_MONTH”
,“QUARTER_KEY”
,“QUARTER_OF_YEAR”
,“DAY_OF_QUARTER”
,“QUARTER_SHORT_DESC”
,“QUARTER_DESC”
,“FIRST_DAY_OF_QUARTER”
,“LAST_DAY_OF_QUARTER”
,“YEAR_KEY”
,“FIRST_DAY_OF_YEAR”
,“LAST_DAY_OF_YEAR”
,“IS_HOLIDAY”]) }}’
)
}}

2 Likes

Hey @josh!

I wrote about the approach we took at GitLab around this a bit ago. Take a look and let me know what you think: Building a Calendar Table using dbt

Thanks @emilie I took a look at the date dimension code that you referenced. Looks like we have many similar columns. You also have some fiscal year columns that we don’t. On the other hand, we have some additional columns like DAY_SUFFIX (e.g. “1st”, “2nd”), WEEK_DAY_DESC (not just “Wed” but “Wednesday”, "Thursday), some keys like MONTH_KEY, and, most importantly, a holiday indicator and actual names for specific holidays.

Also I think some of the calculations in your date dimension could be simplified as we have done, e.g. instead of:

FIRST_VALUE(date_day) OVER (PARTITION BY year_actual, month_actual ORDER BY date_day) AS first_day_of_month,

you could do:

TRUNC(date_day, 'Month')

for this:

decode(extract('month',date_day),

           1, 'January',

           2, 'February',

           3, 'March',

           4, 'April',

           5, 'May',

           6, 'June',

           7, 'July',

           8, 'August',

           9, 'September',

           10, 'October',

           11, 'November',

           12, 'December') AS month_name,

you could do:

TO_CHAR(date_day, 'MMMM')

and for this:

(year_actual || '-' || DECODE(extract(quarter,date_day),

           1, 'Q1',

           2, 'Q2',

           3, 'Q3',

           4, 'Q4')) AS quarter_name,

you could do:

'Q' || EXTRACT(QUARTER FROM date_day)

Just simplifying and making the code cleaner. You could also simplify many of the other window functions in the same way using the TRUNC() and LAST_DAY() functions.

Do you really not have a need for holiday calculations? We often need to do workday calculations or similar things and need to know the holidays for that.

1 Like

Very cool! Just wanted to mention this macro I added to the dbt-date package a few weeks ago that does some of that as well.

1 Like

@josh Thanks for the feedback. I created https://gitlab.com/gitlab-data/analytics/merge_requests/2121 to make those improvements and add some global holidays. We haven’t had a need for them yet, but that’s no reason not to add it!

1 Like

Hi there, how do you guys do with the function concat and year, mine is not working.