Hi all,
I’d like to use dbt to build a DWH with multiple source systems (one for each company of the group) which I need to load at different schedules during the day.
The idea is to land to a common DWH data model, so that e.g. my INVOICE table will contain data coming from all the different source systems (each one with its own data structure, and thus with its own transformations).
I know I cannot have in dbt more than one model pointing to the same physical table, so I thought I could use 2 different approaches:
- create one dbt project for each source system (which is not my prefered choice)
- create an intermediate model for each source system, having the same data structure as the target, and then load the final DWH table with something like
select * from {{ ref( var('invoice') ) }}
giving a different value to var(‘invoice’) for each run depending on the source system I’m loading.
Did some of you had the same problem in the past?
How did you adress it?
Do you see any draw back in approach 2?
The bad thing I see is that documentation won’t be complete, i.e. the DAG will point to the value of var(‘invoice’) I used when I generated the documentation.
Thanks
Daniele