Hey @Koen, thanks for your question. The way I would handle records in the fact table that don’t match to a record in the dimension table is through the method that @josh suggested in his post here.
So, let’s say our dimension table is dim_user
, and we have the following columns:
customer_id
customer_key
(the surrogate key, created by usinghash(customer_id)
customer_name
In the dbt model used to generate dim_user
, I would add a row (using union all
) for the following record:
select
-1 as customer_id,
hash(-1) as customer_key,
'Unknown Customer' as customer_name
Then from the fact table (e.g. fact_sales
), I will try to generate a surrogate key that matches dim_user
.
select
sale_id,
price,
quantity,
hash(coalesce(customer_id, -1)) as customer_key -- if customer_id is null, -1 will be used instead.
from
{{ ref('staging_sales') }}
This way, the fact table will reference a customer with the customer name Unknown Customer
instead of null
which might cause confusion.