Hi guys,
I am completely new to dbt and there are a few things I can't get my head around.
In short: can dbt act as a substitute for a data pipeline built in Snowflake with streams/tasks, etc.?
Imagine I have a source system that extracts data to S3 each day. My goal is to build a dimensional star schema.
In Snowflake I have external tables built on top of S3. So, for example, the customer table will look like initial_load/delta_day1/delta_day2 and so on, growing each day.
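To make this concrete, here is roughly what my Snowflake setup looks like (the database, stage, table, and stream names below are just examples, not my real ones):

```sql
-- External table over the daily S3 dumps (partition folders like
-- initial_load/, delta_day1/, delta_day2/, ...)
CREATE EXTERNAL TABLE landing.customer_ext
  WITH LOCATION = @landing.s3_stage/customer/
  FILE_FORMAT = (TYPE = PARQUET)
  AUTO_REFRESH = TRUE;

-- Streams on external tables must be insert-only
CREATE STREAM landing.customer_stream
  ON EXTERNAL TABLE landing.customer_ext
  INSERT_ONLY = TRUE;
```

The stream then shows only the rows that arrived since it was last consumed, which is what I was planning to feed into the snapshots.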
What should my sources for the DIM tables be? I understand the idea of using snapshots. I also have streams built on top of these external tables, and I build snapshots from those streams. So I am using the streams as SOURCES in dbt (the streams feed the snapshots, and the snapshots feed the models further down the line). Is this the correct approach? Or should I use the external tables as sources? Or something different entirely?

A few follow-up worries:
- What if something goes wrong and the source system dumps new data before the snapshot is created? The stream would then contain two days of changes, because it wasn't consumed in between. How can I enforce the correct loading sequence into the snapshot so my history stays correct?
- What if I have to go back in time and regenerate the last three days?
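For reference, this is roughly how I have wired it up in dbt so far (again, names are illustrative; the stream is registered as an ordinary source table and the snapshot selects from it):

```yaml
# models/sources.yml
version: 2

sources:
  - name: landing
    database: MY_DB
    schema: LANDING
    tables:
      - name: customer_stream   # the Snowflake stream on the external table
```

```sql
-- snapshots/customer_snapshot.sql
{% snapshot customer_snapshot %}

{{
    config(
      target_schema='snapshots',
      unique_key='customer_id',
      strategy='check',
      check_cols='all'
    )
}}

select * from {{ source('landing', 'customer_stream') }}

{% endsnapshot %}
```

My uncertainty is exactly about this pattern: whether selecting from the stream inside a snapshot is reliable, given that a stream's offset only advances when it is read by a DML statement, and what happens to the snapshot history if the stream accumulates more than one day of changes.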
Thank you