can i create an scd type 2 model in dbt without using snapshot?

Hi Experts!

I am super new to dbt and have a use case to build an SCD type 2 model which would have valid_to and valid_from (and maybe a active flag) columns. This seems easy to do using a snapshot but I’ve read that one should not put snapshots on datamart models.

I’m uncertain why putting a snapshot on a datamart model is a bad idea if we just want to be able to see the history of values that the datamart model had at different points in time?

Can I build a SCD Type 2 model including valid_to and valid_from dates manually in dbt and is there a template I can look at how to do this if it exists?

I’ve looked through different posts on here but cannot find anything that will give me the valid_to, valid_from columns other than using snapshot.

I’d like to use snapshot, because it looks very easy and appears to be working but am hesistant due to the warnings suggesting best practice is to only apply this to source tables.

Thank you!

Hi @jackiesbauer, I think there is no problem to create a snapshot just to monitor your model if you are aware of the risks.

It is a best practice to use snapshots in sources because from the source you are able to recreate any model at that point in time since you have the raw data and you didn’t lose any information through transformations, casting, renaming, etc.

It is not a good practice to use snapshots in models because they are much more mutable than sources, and if you change the way the model is transformed, maybe you can not easily compare current data with historical data.

If you are aware that your snapshot can easily become a mess if you do not take care, I see no problem with using it in your models. It makes no sense to create an SCD type 2 other way since you will have the same problems.

2 Likes

Just keep in mind that you

  • cannot recreate columns
  • depending on the complexity of your model, will not have a reliable updated_at
  • If you apply business logic in a snapshot query, and this logic changes in the future, it can be impossible (or, at least, very difficult) to apply the change in logic to your snapshots.

source: Snapshots | dbt Developer Hub

Thank you for your answer @brunoszdl!
It feels more difficult to put the snapshots on the source models (three different ones) and then figure out how to join those snapshots downstream to create the datamart model so I think I will try to implement this way (knowing the possible drawbacks you ponted out).
I appreciate your time and answer!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.