This is a companion discussion topic for the original entry at BigQuery ingestion-time partitioning and partition copy with dbt | dbt Developer Blog
Thanks for posting on this topic. I am also trying to create tables with ingestion-time partitioning, i.e. tables with daily partitions keyed on the pseudo column, but I am running into issues while creating them. Any suggestions would be really helpful.
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = {
        "field": "day",
        "data_type": "date",
        "time_ingestion_partitioning": true
    }
) }}
select
    x, y, z
FROM a.b.c
WHERE
    status = 'A'
LIMIT 100
I am getting the error below:
14:20:52 Database Error in model
14:20:52 PARTITION BY expression must be _PARTITIONDATE, DATE(_PARTITIONTIME), DATE(<timestamp_column>), DATE(<datetime_column>), DATETIME_TRUNC(<datetime_column>, DAY/HOUR/MONTH/YEAR), a DATE column, TIMESTAMP_TRUNC(<timestamp_column>, DAY/HOUR/MONTH/YEAR), DATE_TRUNC(<date_column>, MONTH/YEAR), or RANGE_BUCKET(<int64_column>, GENERATE_ARRAY(<int64_value>, <int64_value>[, <int64_value>]))
Adding "day" to the select statement gives an error saying it isn't recognised:
select
    day,
    campaign_id,
    NULLIF(COUNTIF(action = 'impression'), 0) impressions_count
from {{ source('logs', 'tracking_events') }}
I just happened to see this thread (a bit late…).
The "PARTITION BY expression must be…" database error above is likely linked to the fact that this wasn't working for "data_type": "date" until the very recent dbt 1.6 release.
In your example, you need the day column to be explicit in the output of the model; it will be wrapped and embedded as _PARTITIONTIME by the queries dbt runs behind the scenes.
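As a minimal sketch (assuming your source exposes a timestamp column, here called event_timestamp, from which the day can be derived; adjust the names to your actual schema), the model could look like this:

{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = {
        "field": "day",
        "data_type": "date",
        "time_ingestion_partitioning": true
    }
) }}

select
    -- the partition field must appear in the model output; dbt maps it to _PARTITIONTIME when loading
    date(event_timestamp) as day,  -- event_timestamp is a hypothetical column, replace with your own
    campaign_id,
    nullif(countif(action = 'impression'), 0) as impressions_count
from {{ source('logs', 'tracking_events') }}
group by day, campaign_id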
Hello. I would also like to use the insert_overwrite strategy with the BigQuery copy API (copy_partitions). However, the temporary table dbt generates as the source to copy from does not include the required column settings or clustering keys, so it fails for me unless I make every column in the target nullable and remove the clustering keys. We are on 1.4, do you know if this is addressed in later versions?
The copy_partitions option leverages the output of your model to copy, but BigQuery has some limitations, such as requiring the clustering and columns to be the same. So far, dbt doesn't ensure that the temporary table is aligned with your existing target table. That could be an interesting feature, as dbt isn't yet able to update clustering settings (since it's stateless regarding the target). If it's not already in the existing issues, feel free to request it on Issues · dbt-labs/dbt-bigquery · GitHub
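For reference, here is a rough sketch of the kind of config involved (the cluster_by key and field names are assumptions for illustration, not taken from your project). Declaring the same cluster_by in the model config as on the existing target table should make dbt build the temporary table with matching clustering, but the column modes (REQUIRED vs NULLABLE) are still not carried over, which is the gap worth raising as an issue:

{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    cluster_by = ['campaign_id'],
    partition_by = {
        "field": "day",
        "data_type": "date",
        "time_ingestion_partitioning": true,
        "copy_partitions": true
    }
) }}

-- the select stays the same as in the earlier example: the partition field ("day" here)
-- must be present in the model output so dbt can route rows to the right partitions before copying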