I’m trying to create a date spine that where the first and last timestamps are taken from another model.
For example, if I have a table that shows the times that someone logs onto their computer, I want to create a date spine that will take the earliest logon time and the latest logon time. I can use dbt_utils.date_spine if I enter set dates for the beginning and end of the spine. I have tried a few different methods of setting the start and end timestamps using variables, such as select min(logon_time) from activity_table but they don’t work.
Does dbt_utils.date_spine have to take set dates as arguments or is there a way to use variables?
Hey @dschupak hopefully you figured this out by now. I ran into the same issue and came up with a solution that works, but I don’t feel 100% great about it. Perhaps one of the experts here can chime in!
{% set get_first_sub_date %}
select min(subscription_start_date)::string from {{ ref('stg_salesforce_sbqq_subscription') }}
{% endset %}
{%- set first_sub_date = dbt_utils.get_single_value(get_first_sub_date, default="'2017-01-01'") -%}
{% set get_last_sub_date %}
select max(subscription_end_date)::string from {{ ref('stg_salesforce_sbqq_subscription') }}
{% endset %}
{%- set last_sub_date = dbt_utils.get_single_value(get_last_sub_date, default="'2026-12-31'") -%}
{{ dbt_utils.date_spine(
datepart="day",
start_date="'" + first_sub_date + "'",
end_date="'" + last_sub_date + "'"
)
}}
What particularly didn’t feel great was the parts where I had to cast the result to a ::string, and then later when using it in date_spine having to manually quote it.
I’m curious if anyone found a different way to solve this problem without the casting/manually quoting! I have been running into this problem a lot when combined data about campaigns in salesforce against ad campaigns where they constantly need to be left joined to a date spine.
This is close! If you have a look at how the date_spine macro is implemented, you’ll see that the first step is to run a datediff of start_date and end_date. This means that if you pass something in that is valid SQL, it will evaluate the valid SQL.
In this case, you just need to wrap it in brackets:
{{ dbt_utils.date_spine(
datepart="second",
start_date="(select min(logon_time) from ref('activity_table'))",
end_date="(select max(logon_time) from ref('activity_table'))",
) }}`
If you’re using a ref, you’ll need to quote them as I’ve done above, alternating single and double quotes.
@ohitsmekatie in your case, you might prefer to pregenerate a date_spine table with all days between, say, 2000 and 2050 (whatever dates make sense).
That’s my normal recommendation, but it isn’t very viable when you’re doing second-by-second spining like the original post was. Salesforce ad campaigns are probably at a more tractable granularity though!
{{ dbt_utils.date_spine(
datepart="day",
start_date="cast('2019-01-01' as date)",
end_date="(select max(date) from ref('dim_date'))"
) }}
Database Error in rpc request (from remote system.sql) Table-valued function not found: ref at [7:37]
But this DOES
{{ dbt_utils.date_spine(
datepart="day",
start_date="cast('2019-01-01' as date)",
end_date="(select max(date) from " + ref('dim_date') | string + ")"
) }}
It is really nice macro although it is limited. I was thinking to use this parameter as :
datepart=“15 minutes”.
Also macro does not have flag working days or is_holiday like in calendar dimension, that is why calendar table can still be used as CTE expression for some cases better.