Fivetran and incremental loads

Hello Guys,

I’m thinking on using fivetran + dbt for my next Dwh project. I have one question, fivetran syncs the tables that I need to create My facts and dims tables. From what I know, usually we truncate the tables before loading the data from source system. This way we can really do a nice incremental load.

After all if I have a staging area from 1TB it does not make since to keep it forever there and reload all the facts and dim for multiple years if I just want to load 1 day.

Is there anyway approach to deal with this? Like delete the data from the staging area after it was inserted into the dim and fact tables

As systems get larger it would not be ideal to truncate and reload. What I think you should consider is an incremental load and there are things in Fivertran and dbt target databases that can help.

Example:

There is a column in Fivetrans managed tables (_FIVETRAN_SYNCED) that can help with what is new/updated.

https://fivetran.com/docs/getting-started/system-columns-and-tables

There are also source database features like streams in Snowflake that can do the change detection as well.

https://docs.snowflake.com/en/user-guide/streams.html

hope this helps

So we replicate the whole erp db into the dwh dB where dbt runs ? We keep this data forever there ?

Would be great to have an example how to deal with this …

What I would do now is:
1- sync the source table with fivetran
2 - create dim and fact tables with incremental loads
3 - truncate the source table with a post hook

This way The staging area from the dwh is always clean. If I need a re-sync then Incan trigger it with fivetran

Ist this correct ?

Jorge