What I’m trying to do
I’m trying to maintain a metadata table in BigQuery that records the number of rows inserted each time an incremental model runs.
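For concreteness, the tracking table looks roughly like this (a sketch; the column names match the post-hook further down, backticked because PARTITION is a reserved keyword in BigQuery):

CREATE TABLE IF NOT EXISTS raw_dataset.processed_partitions (
  `table` STRING,             -- model/table the rows were inserted into
  row_count INT64,            -- number of rows inserted by that run
  `partition` DATE,           -- raw-table partition the rows were read from
  query_timestamp TIMESTAMP   -- when the count was recorded
);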
The context of why I’m trying to do this
I’m ingesting data from Pub/Sub into a BigQuery table partitioned on the Pub/Sub publish_time metadata. The data for each message is stored in a JSON blob. I then want to process the messages into another table partitioned by the date field within the messages themselves. To make sure that every message in the raw table has been processed, I want to maintain a count of how many messages have been processed from each partition and compare it to the total number of messages currently in that partition.
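The reconciliation check itself would be something along these lines (a sketch: raw_dataset.raw_messages and the column names are placeholders, and it assumes each incremental run inserts a disjoint batch of messages, so the recorded counts can simply be summed per partition):

SELECT
  partition_date,
  raw.raw_count,
  COALESCE(proc.processed_count, 0) AS processed_count
FROM (
  -- current number of messages in each raw partition
  SELECT DATE(publish_time) AS partition_date, COUNT(*) AS raw_count
  FROM raw_dataset.raw_messages
  GROUP BY 1
) AS raw
LEFT JOIN (
  -- total number of messages recorded as processed from each partition
  SELECT `partition` AS partition_date, SUM(row_count) AS processed_count
  FROM raw_dataset.processed_partitions
  WHERE `table` = 'table_name'
  GROUP BY 1
) AS proc
USING (partition_date)
WHERE raw.raw_count != COALESCE(proc.processed_count, 0)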
What I’ve already tried
This illustrates what I’m trying to do:
post-hook: |
  INSERT INTO raw_dataset.processed_partitions (`table`, row_count, `partition`, query_timestamp)
  VALUES ('table_name', @@row_count, partition, CURRENT_TIMESTAMP())
Unfortunately, the @@row_count from the model run isn’t available in the post-hook, so it always comes back NULL. (Presumably this is because dbt issues the hook as a separate query rather than as part of the same BigQuery script as the model’s DML, and @@row_count only reflects the most recent DML statement within a script.)
In a BigQuery script I could just run the merge statement and then run another INSERT into the processed partitions table using the @@row_count variable.
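For example (a sketch with placeholder table, column, and partition values; the MERGE stands in for whatever DML the incremental model would generate):

-- In a multi-statement script, @@row_count holds the number of rows
-- modified by the most recently completed DML statement.
MERGE processed_dataset.table_name AS t
USING (
  SELECT id, payload, event_date
  FROM raw_dataset.raw_messages
  WHERE DATE(publish_time) = DATE '2024-01-01'  -- placeholder partition
) AS s
ON t.id = s.id
WHEN NOT MATCHED THEN
  INSERT (id, payload, event_date)
  VALUES (s.id, s.payload, s.event_date);

INSERT INTO raw_dataset.processed_partitions (`table`, row_count, `partition`, query_timestamp)
VALUES ('table_name', @@row_count, DATE '2024-01-01', CURRENT_TIMESTAMP());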
Is something like this possible to replicate in dbt?