# Modeling changes to daily/weekly/monthly active users

Jump straight to the project docs, or to the source code

### Our data

Let’s say we have a table with one record per user, per day they were active:

user_id date_day is_active
1 2020-08-01 true
1 2020-09-01 true
1 2020-09-04 true
1 2020-09-05 true
1 2020-09-12 true
1 2020-09-17 true
2 2020-09-01 true

### The problem

We’ve been asked to create reports for daily, weekly, and monthly active users.

On the surface, we know that we could calculate daily active users like this:

Daily active users

``````select
date_day,
count(distinct user_id) as daily_active_users
from {{ ref('user_active_days') }}
group by 1
order by 1
``````
date_day daily_active_users
2020-08-13 1
2020-09-01 2
2020-09-04 1
2020-09-05 1

(Note, we haven’t filled in the missing days, which might be something worth doing )

We can follow similar patterns for weekly and monthly active users, to generate outputs like this.

Weekly active users SQL
``````select
date_trunc('week', date_day) as date_week,
count(distinct user_id) as weekly_active_users
from {{ ref('user_active_days') }}
group by 1
order by 1
``````
date_week weekly_active_users
2020-08-10 1
2020-08-17 1
2020-08-24 1
2020-08-31 2
2020-09-07 1
2020-09-14 1
Monthly active users SQL
``````select
date_trunc('month', date_day) as date_month,
count(distinct user_id) as monthly_active_users
from {{ ref('user_active_days') }}
group by 1
order by 1
``````
date_month monthly_active_users
2020-08-01 1
2020-09-01 2

However, wouldn’t it be if we could see this changing on a daily basis? That can feel strange when thinking about “monthly active users” for a day in the middle of the month, but essentially we want to get to something like this — for each day:

• How many active users were there today?
• How many active users were there in the last week / 7 days?
• How many active users were there in the last month / 30 days?

### GOAL TABLE

date_day daily_active_users weekly_active_users monthly_active_users
2020-09-01 2 2 2
2020-09-02 0 2 2
2020-09-03 0 2 2
2020-09-04 1 2 2
2020-09-05 1 2 2
2020-09-06 0 2 2
2020-09-07 0 2 2
2020-09-08 0 1 2
2020-09-09 0 1 2

The “semi-additive” nature of these metrics means that we can’t just add up our daily active users each week to get the number of weekly active users, so the way to get here may not feel immediately obvious.

### The solution

Fortunately we’ve seen this pattern before!

Before we can build the summary table, we need to build a table with one record per user, per day (regardless of whether they were active that day or not), and fill in columns for:

• `active_today`
• `active_l7_days`
• `active_l30_days`

To solve this, you’ll need to use a date spine and some window functions.

SQL to create this table
``````with user_active_days as (
select * from {{ ref('user_active_days') }}
),

all_days as (
select * from {{ ref('all_days') }}
),

user_first_active_day as (
select
user_id,
min(date_day) as first_active_day
from user_active_days
group by 1
),

-- use a date spine to fill in missing days
spined as (
select
user_first_active_day.user_id,
all_days.date_day
from user_first_active_day
left join all_days
-- all days after the user was first active
on all_days.date_day >= user_first_active_day.first_active_day
),

filled as (
select
spined.date_day,
spined.user_id,

coalesce(user_active_days.is_active, false) as is_active_today,

max(is_active_today) over (
partition by spined.user_id
order by spined.date_day
rows between 6 preceding and current row
) as is_active_l7_days,

max(is_active_today) over (
partition by spined.user_id
order by spined.date_day
rows between 29 preceding and current row
) as is_active_l30_days

from spined

on spined.date_day = user_active_days.date_day
and spined.user_id = user_active_days.user_id
)

select * from filled

``````
date_day user_id is_active_today is_active_l7_days is_active_l30_days
2020-08-13 1 true true true
2020-08-14 1 false true true
2020-08-15 1 false true true
2020-08-16 1 false true true
2020-08-17 1 false true true
2020-08-18 1 false true true
2020-08-19 1 false true true
2020-08-20 1 false false true
2020-08-21 1 false false true
2020-08-22 1 false false true

Once we have this, it’s a reasonably simple hop from that table to our goal table (shown above):

``````with active_users as (
select * from {{ ref('user_active_days_spined') }}
),

final as (
select
date_day,

sum(is_active_today::integer) as daily_active_users,
sum(is_active_l7_days::integer) as weekly_active_users,
sum(is_active_l30_days::integer) as monthly_active_users

from active_users

group by 1
)

select * from final
order by date_day

``````

### See the code

We’ve published some sample code for this over here and deployed the docs for this project here. Feel free to explore! (Note: the code has only been written for Snowflake)

4 Likes