I am trying to figure out how costs are calculated when using dbt's partitions_to_replace vs _dbt_max_partition vs copy_partitions.
Note: when I say "costs", I look at the results of every query dbt creates and check the MiB processed.
I ran tests (findings may vary, of course) on a 44 GiB table:
- _dbt_max_partition
The month_start_date >= date_sub(_dbt_max_partition, interval N day) logic costs 27.07 GiB of processing overall, which is quite a lot.
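For context, the _dbt_max_partition variant of the incremental filter looks roughly like this (a sketch, not my exact model; N=3 is a placeholder):

```sql
-- Sketch of the _dbt_max_partition approach. _dbt_max_partition is a BigQuery
-- scripting variable that dbt declares for insert_overwrite models when a static
-- `partitions` list is not configured; it holds the max partition value already
-- present in the target table.
{% if is_incremental() %}
where month_start_date >= date_sub(_dbt_max_partition, interval 3 day)
{% endif %}
```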
- partitions_to_replace
month_start_date in ({{ partitions_to_replace | join(',') }}) processed the exact same GiB with the same running time, which is disappointing (I'm on the latest dbt version, 1.6.1; config and is_incremental block below):
{{
    config(
        materialized = 'incremental',
        partition_by = {'field': 'created_date', 'data_type': 'date', 'granularity': 'month'},
        cluster_by = ['first_line_item', 'is_first_order', 'customer_id', 'transaction_id'],
        incremental_strategy = 'insert_overwrite',
        on_schema_change = 'sync_all_columns',
        partitions = partitions_to_replace
    )
}}
{% if is_incremental() %}
where month_start_date in ({{ partitions_to_replace | join(',') }})
{% endif %}
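For completeness, partitions_to_replace is declared at the top of the model before it is referenced in config; a minimal sketch following the dbt docs pattern (my real list is computed similarly, with monthly granularity):

```sql
-- Sketch: the entries must be literal SQL expressions, since dbt interpolates
-- them directly into the merge/overwrite statement and the where clause.
{% set partitions_to_replace = [
    "date_trunc(current_date, month)",
    "date_sub(date_trunc(current_date, month), interval 1 month)"
] %}
```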
- copy_partitions
It ran in twice the time and cost twice as much.
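The copy_partitions variant I tested uses the same model, with the flag enabled inside partition_by (a trimmed sketch, not my full config; copy_partitions requires the insert_overwrite strategy):

```sql
{{
    config(
        materialized = 'incremental',
        incremental_strategy = 'insert_overwrite',
        partition_by = {
            'field': 'created_date',
            'data_type': 'date',
            'granularity': 'month',
            'copy_partitions': true
        }
    )
}}
```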
Questions regarding the findings:
a. Am I looking at costs correctly?
b. Why do partitions_to_replace and _dbt_max_partition take the exact same time?
c. Why does copy_partitions run the query twice and the COPY twice? Is this expected?
d. copy_partitions takes twice the time: is that due to the double run above, or because the copy for each partition is done sequentially?
e. Is my config above for partitions_to_replace correct?
Any help would be appreciated.