Hi,
I am new to dbt and we are trying to implement scd 2 using dbt. I learned about Snapshot, however, is there nay other way to implement scd 2 in dbt?
Thanks
Hi,
I am new to dbt and we are trying to implement scd 2 using dbt. I learned about Snapshot, however, is there nay other way to implement scd 2 in dbt?
Thanks
Why are you looking for an alternative to snapshots? Are you concerned that they won’t solve a specific use case you have in mind?
Thanks for reply Joellabes. There are couple of things, we want to check how we can handle scd 1 and scd 2 in same table. Also, we have multiple teams doing simultaneous development and has their own folder structure for modeling and we observed that, we cannot have snapshots in different folders as per each team.
Hence, we want to explore new avenues to handle scd 2 using dbt
Generally you should snapshot your source data and bring it in with zero transformations. Let other teams model off that raw data if they need to. I’d be curious of the use case that requires each team to have their own snapshot of raw data. Sounds like you want historical data and also the current status, to do that just create your type 2 dim and then a view on top that filters to the most current record.
Thanks for reply. The reason we are thinking out of box here is, snapshot creates table in DB while our policy states that, tables shall be created using scripts, Git check ins and Dev Opps programs. No out side program should create tables automatically.
We are looking into macros but no luck so far. If you have any specific suggestion, we are all ears
Appreciate your help.
dbt snapshots’ columns are knowable in advance (it’s just the columns of your existing snapshot table plus the four bookkeeping columns dbt_scd_id
, dbt_updated_at
, dbt_valid_from
and dbt_valid_to
), so you could always pre-generate your tables and have dbt snapshot populate them.
With that said, keep in mind that unless you’re building everything as views, dbt run
will also create tables…
I agree with @yoitsbenc - build a SCD2 table and then filter to current records (those where dbt_valid_to
is null) in a downstream model to get the SCD1 equivalent.
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.