I’d like to setup my fact tables as incremental, but am not sure how to make sure they load correctly from multiple sources / refs.
Say I have a
fact_a , which pulls from
ref_c. I’d like
fact_a to refresh in cases where either
ref_c has been updated according to my incremental logic, but can’t figure out how to do this correctly.
An example of what I want:
Let’s say we join on
ref_b_id, which is the unique key for the
fact_a table. If I’m generating incremental rows for fact_a filtering on
updated_at > '2023-05-29', I want the new rows to be inserted to be:
We’ve added the rows where the source records were recently updated in either ref_b or ref_c.
I feel like this should be pretty straightforward but for some reason I’m having a hard time finding examples of others doing the same. All the examples I’ve seen only show joining on refs where both / all of the refs have been updated within the time window. We have fact tables with quite a few sources so ideally this would work across more than 2 tables