dbt recreates the whole table?

heinistic · July 24, 2024, 7:16am

The problem I’m having

This is not a problem so much as something we want to understand. We noticed that dbt recreated the whole table and processed 41TB of data instead of running only for a specific processing date.

What we noticed:

1. dbt runs our models and creates the result under model_name__dbt_tmp.
2. dbt recreates the whole table by running the statement below. The table already existed and held a lot of data, so this step processed 41TB of data.

   create or replace table model_name as (
     select
       col1,
       col2,
     from model_name
   );

3. dbt runs partition merges into the same table.

Some settings we have:

incremental materialization
require_partition_filter = true
+on_schema_change: "sync_all_columns"
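
For readers trying to reproduce this, a model with the settings above could be configured roughly as in the sketch below. It assumes the dbt-bigquery adapter (where require_partition_filter is a model config). The partition column `processing_date`, the `upstream_model` reference, and the `processing_date` variable are hypothetical placeholders, not values from our project.

```sql
-- Minimal sketch only; names marked "hypothetical" are not from the original post.
{{
  config(
    materialized='incremental',
    -- hypothetical partition column and data type
    partition_by={'field': 'processing_date', 'data_type': 'date'},
    require_partition_filter=true,
    on_schema_change='sync_all_columns'
  )
}}

select
    col1,
    col2,
    processing_date  -- hypothetical column
from {{ ref('upstream_model') }}  -- hypothetical upstream model
{% if is_incremental() %}
-- on incremental runs, restrict the scan to the date being processed
where processing_date = date('{{ var("processing_date") }}')  -- hypothetical variable
{% endif %}
```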

Question: Why is dbt running step 2 from the list above?

Thank you!

