I recently started using dbt and am loving it so far; however, I noticed something odd with incremental models.
I am using incremental models to build the “intermediate” layer of my data: the raw data is JSON-based, so I want to extract the keys I need and populate a relational table.
To avoid scanning the entire dataset on every run, we decided to use incremental models with the merge strategy and on_schema_change='append_new_columns'.
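For reference, the setup looks roughly like this (a minimal sketch, not our actual models — the model name, keys, and source are placeholders):

```sql
-- Hypothetical intermediate model; column names and source are placeholders
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',
        unique_key='id',
        on_schema_change='append_new_columns'
    )
}}

select
    json_value(payload, '$.id') as id,
    json_value(payload, '$.created_at') as created_at
from {{ source('raw', 'events') }}
{% if is_incremental() %}
-- only process rows newer than what is already in the target table
where _loaded_at > (select max(_loaded_at) from {{ this }})
{% endif %}
```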
Once we started using the models, we noticed that dbt creates “temporary” tables for the data that needs to be merged, with a lifetime of 12 hours. However, since we run our transformations roughly every 30 minutes, these tables are always present in our BigQuery project.
We do want to keep using the incremental strategy and appending new columns, but according to this GitHub issue, it seems we might be able to avoid the problem by changing on_schema_change to a different value.
Since the data is mostly filtered down and not too large, we don’t mind the storage cost — but we do mind that our project now has twice as many tables: for every incremental model we made, there is both the output table X and the temporary table X__dbt_tmp.
Is there a way to get dbt to clean up these temporary tables itself? It seems like it should do this automatically, but it doesn’t.
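Or is a scheduled sweep the only option? I was considering something like the following BigQuery script as a stopgap (hypothetical — `my_dataset` is a placeholder for our dataset, and this assumes every table ending in `__dbt_tmp` is safe to drop):

```sql
-- Hypothetical cleanup sweep; my_dataset is a placeholder.
-- Finds every leftover dbt temp table in the dataset and drops it.
FOR rec IN (
  SELECT table_name
  FROM my_dataset.INFORMATION_SCHEMA.TABLES
  WHERE table_name LIKE '%__dbt_tmp'
) DO
  EXECUTE IMMEDIATE FORMAT('DROP TABLE `my_dataset.%s`', rec.table_name);
END FOR;
```

But this feels like something dbt should be handling for us.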
Thanks!