When you have an event stream, it’s common to need to find the latest non-null value in the stream and “forward-fill” it all subsequent events.
For instance, suppose you have a stream of web events. In one such event a user creates a profile with relevant properties (email, location, whatever). All subsequent events in the stream should have that information attached to the event. To do that, you need to forward fill the relevant properties.
Here’s a macro to do that based on this blog post:
{%- macro forward_fill(source, id, timestamp, field) -%}
with init_data as (
select
{{ id }} as id,
{{ timestamp }} as timestamp,
{{ field }} as field
from {{ source }}
),
grouped as (
SELECT
*
FROM
(
SELECT
id,
timestamp,
field,
COUNT(field) OVER (PARTITION BY id ORDER BY timestamp) as grouper
FROM
init_data
) as grouped
),
forward_filled as (
select
id,
timestamp,
field,
max(field) OVER (PARTITION BY id, grouper) as forward_filled
from grouped
order by id, timestamp
),
most_recent as (
select
distinct on (id) id,
forward_filled
from forward_filled
order by id, timestamp desc
)
select
*
from most_recent
{%- endmacro -%}
```