using `merge_exclude_columns` with ignore nulls - possible?

I have a users model that currently run with a full-refresh and i want to move it to be incremental with incremental_strategy=merge and run every hour.

The model has columns calculated with first_value and last_value functions, for example: first_time_login and last_time_login.
i know last_time_login column will still work properly but i have problem with first_time_login- i don’t want to override it in each run. to solve it, I found out i can use merge_exclude_columns.

But then i encountered another problem - the first_time_login might be NULL at initially which means it will never be updated again.
example: a user was created yesterday but made his first login only today. if i run hourly - the first_time_login will be set to NULL and will never be updated because of merge_exclude_columns.

My question:
Can i use merge_exclude_columns but excluding NULLS? meaning i want to:
1. get the first value which is not null
2. once i have updated value - never update it again

1 Like

I have this exact situation. Interested to hear the recommended solution.

Why not? It should stay static, shouldn’t it? It’ll be null until they first log in, and then on every subsequent invocation it will stay as the first time they logged in. So it seems like it shouldn’t really matter if it gets overwritten to the same value or if it’s ignored.

As i understand, if i include a column (say first_time_login) in merge_exclude_columns - the value there will be the value that i get from a full-refresh and will never change again.

So if i run a full-refresh and a user never logged in yet - i will get NULL. this NULL will never change.
I don’t want it - i to have a value of the first time the user logged in not necessarily from a full-refresh run

I found anohter solution rather using merge_exclude_columns for my use case.
I bring current table values with

    self_columns AS
    (
        SELECT user_id,
               first_time_login,
               last_time_login
               ....
        FROM {{ this }}
    ),

the final select for those columns looks like:

SELECT ...,
       COALESCE(self_columns.first_time_login, main.first_time_login) AS first_time_login,
       IFF(main.last_time_login IS NOT NULL, main.last_time_login, self_columns.last_time_login) AS last_time_login,
FROM main m
LEFT JOIN self_columns s ON m.user_id = s.user_id

This is correct, which is why I’m suggesting you not add it to merge_exclude_columns. By leaving it as a column that can be updated,