This is a companion discussion topic for the original entry at BigQuery ingestion-time partitioning and partition copy with dbt | dbt Developer Blog
Thanks for posting on this topic. I am also trying to create tables with ingestion-time partitioning, i.e. tables with daily partitions keyed on the pseudo column, but I am running into issues while creating them. Any suggestions would be really helpful.
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = {
        "field": "day",
        "data_type": "date",
        "time_ingestion_partitioning": true
    }
) }}
select
    x, y, z
FROM a.b.c
WHERE
    status = 'A'
LIMIT 100
I am getting the error below:
14:20:52 Database Error in model
14:20:52 PARTITION BY expression must be _PARTITIONDATE, DATE(_PARTITIONTIME), DATE(<timestamp_column>), DATE(<datetime_column>), DATETIME_TRUNC(<datetime_column>, DAY/HOUR/MONTH/YEAR), a DATE column, TIMESTAMP_TRUNC(<timestamp_column>, DAY/HOUR/MONTH/YEAR), DATE_TRUNC(<date_column>, MONTH/YEAR), or RANGE_BUCKET(<int64_column>, GENERATE_ARRAY(<int64_value>, <int64_value>[, <int64_value>]))
Adding "day" to the select statement gives an error saying it isn't recognised:
select
    day,
    campaign_id,
    NULLIF(COUNTIF(action = 'impression'), 0) impressions_count
from {{ source('logs', 'tracking_events') }}
I just happened to see this thread (a bit late…).
The "PARTITION BY expression must be…" database error above is likely linked to the fact that this wasn't working for "data_type": "date" until the very recent dbt 1.6 release.
In your example, you need the day column to be explicit in the output of the model; it will be wrapped and embedded as _PARTITIONTIME by the queries dbt runs behind the scenes.
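As a minimal sketch (assuming your source exposes a timestamp column, here called event_timestamp, from which the day can be derived; adjust the names to your actual schema), the model could look like this:

{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = {
        "field": "day",
        "data_type": "date",
        "time_ingestion_partitioning": true
    }
) }}

select
    -- the partition field must appear in the model output; dbt maps it to _PARTITIONTIME when loading
    date(event_timestamp) as day,  -- event_timestamp is a hypothetical column, replace with your own
    campaign_id,
    nullif(countif(action = 'impression'), 0) as impressions_count
from {{ source('logs', 'tracking_events') }}
group by day, campaign_id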
Hello. I would also like to use the insert_overwrite strategy with the BigQuery copy API (copy_partitions). However, the temporary table dbt generates as the source to copy from does not include the required column settings or clustering keys, so it fails for me unless I make every column in the target nullable and remove the clustering keys. We are on 1.4, do you know if this is addressed in later versions?
The copy_partitions option leverages the output of your model to copy, but BigQuery has some limitations, such as requiring the clustering and columns to be the same. So far, dbt doesn't ensure that the temporary table is aligned with your existing target table. That could be an interesting feature, as dbt isn't yet able to update clustering settings (since it's stateless regarding the target). If it's not already in the existing issues, feel free to request it on Issues · dbt-labs/dbt-bigquery · GitHub
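For reference, here is a rough sketch of the kind of config involved (the cluster_by key and field names are assumptions for illustration, not taken from your project). Declaring the same cluster_by in the model config as on the existing target table should make dbt build the temporary table with matching clustering, but the column modes (REQUIRED vs NULLABLE) are still not carried over, which is the gap worth raising as an issue:

{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    cluster_by = ['campaign_id'],
    partition_by = {
        "field": "day",
        "data_type": "date",
        "time_ingestion_partitioning": true,
        "copy_partitions": true
    }
) }}

-- the select stays the same as in the earlier example: the partition field ("day" here)
-- must be present in the model output so dbt can route rows to the right partitions before copying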