Every often, someone asks us about finding the active days for a subscription/user/account. We call this pattern “date spining”
For example, this question was asked on dbt Slack last week:
Any idea how to calculate the active products historically (we have products expiring after 1 year from the purchase, they can also be cancelled). It would be good to see how active products change over time.
So let’s assume you can get your data into this shape, in a model called product_subscription_periods
:
subscription_id | user | product | start_date | end_date |
---|---|---|---|---|
1 | emily | widgets | 2018-01-01 | 2019-01-01 |
2 | jack | gizmos | 2018-05-02 | 2018-06-01 |
We want to create one record per day that the subscription was active, like this:
subscription_id | user | product | date_day |
---|---|---|---|
1 | emily | widgets | 2018-01-01 |
1 | emily | widgets | 2018-01-02 |
… | … | … | … |
1 | emily | widgets | 2018-12-31 |
2 | jack | gizmos | 2018-05-02 |
2 | jack | gizmos | 2018-05-03 |
… | … | … | … |
2 | jack | gizmos | 2018-05-31 |
To do this, you need a list of all dates in your warehouse. We often use the dbt_utils.date_spine
macro for this, to materialize a table of dates out of thin air (no matter which data warehouse you use!).
Our util_days
model will look something like:
{{ dbt_utils.date_spine(
datepart="day",
start_date="'2018-01-01'",
end_date="dateadd(month, 1, date_trunc('month', getdate()))"
)
}}
To build our product_subscriptions_by_day
model, the SQL looks something like:
with product_subscription_periods as (
select * from {{ ref('product_subscription_periods') }}
),
util_days as (
select * from {{ ref('util_days') }}
),
spined as (
select
product_subscription_periods.subscription_id,
product_subscription_periods.user,
product_subscription_periods.product,
util_days.date_day
from product_subscription_periods
left join util_days
-- start date is before the date we want in our table
on product_subscription_periods.start_date <= util_days.date_day
-- end date is after the date we want in out model, or there is no end date
and (product_subscription_periods.end_date > util_days.date_day
or product_subscription_periods.end_date is null)
)
select * from spined
Now that we have this model, it makes it really easy to calculate metrics in our BI tool, e.g. how many active subscriptions we have on each day:
select
date_day,
count(subscription_id) as n_active_subscriptions,
count(case when product = 'widgets' then subscription_id end) as n_active_widget_subscriptions,
count(case when product = 'gizmos' then subscription_id end) as n_active_gizmo_subscriptions,
count(distinct user) as n_distinct_users_with_subscriptions
from analytics.product_subscriptions_by_day
group by 1
We use this pattern to model monthly recurring revenue based on subscriptions, to calculate daily active users, to model inventory levels, and a lot more!
Want to see it in action? We use this pattern in the Monthly Recurring Revenue (MRR) playbook.
For a more in-depth example, the team at GitLab have open sourced their dbt project – check out their date spine model and an amortized subscription model that joins to it. Thanks @tmurphy and @emilie!