Handling soft deletes in dbt snapshots

Hello,

I am setting up snapshots using dbt using the recommended timestamp strategy.

I see that there’s a configuration option for hard deletes. In which case the dbt_valid_to field is set to the current_timestamp.

I wanted to leverage the snapshots for soft deletes as well.
But I wasn’t able to find a way to close the record on a soft delete.

Apart from implementing a custom snapshotting strategy, I’m not sure if there’s anything else that can be done.

Has anyone encountered this issue / have any thoughts on how best to deal with soft deletes?

What I mean by closing the record is use the value in a soft deleted column like deleted_at, in the dbt_valid_to field rather than the current_timestamp.

Hi Harsh - if you have a soft delete column in the table, you can create a view on top of the snapshot to give you the custom logic you want. case when soft_delete_flag = ‘Y’ then dbt_updated_at else dbt_valid_to end as dbt_valid_to. Or something like that…

We use a view to provide user friendly active flags and start/end dates for end user consumption.

The timestamp strategy will work out of the box, if your updated_at field is set when the records are soft deleted.

Otherwise, a check strategy will work, by including the deleted_at or is_deleted field in the check_cols list, like: check_cols=["updated_at", "deleted_at"].

See the docs for the check strategy here: Snapshots | dbt Docs (getdbt.com)

Yes, we did evaluate the option of creating a view. I think it is the best option we have so far.

Thank you for your response.

I think for the check_cols strategy the valid_from and valid_to would be the time at which the snapshot is taken. We would want to avoid that.

Because we know the entire event history we have the exact timestamps at which things changed / got deleted.

Thank you for your response.