Duplicate Records on First run with Incremental Model on BigQuery

guzeloglusoner · August 27, 2021, 12:47pm

Hello dbt folks!

On one of the use case we have on our organisation, we have incremental tables which basically hold the append-only records of incoming events and current tables which stores the latest snapshots of incremental records with a unique key.

Above use-case looked us like an exact match to implement incremental model for good.

Documentation states that;

the first time a model is run, the table is built by transforming all rows of source data.

Because we have more than one record with the same unique key on our append-only incremental tables, the first incremental run generates more than one record with same unique key on current tables. Hence, consecutive batch following error;

UPDATE/MERGE must match at most one source row for each target row

Could anyone please let me know how this issue can be addressed with a solution or am I missing something?

Thanks in advance,
Soner

jbagatelli · November 23, 2021, 6:59pm

@guzeloglusoner
I ran into the same problem. Have you ever found a solution for this? If so, would you be able to share?

Thanks,

JB

minhajpasha · November 25, 2021, 11:30am

Hello ,The model need to filter the duplicate records for eg: use of ROW NUMBER on unique key in SQL to filter the records as it’s first run and dbt don’t any records to do the compare and perform the activities .
in details explanation on below link

wayde · November 29, 2021, 8:44pm

Hey @guzeloglusoner @jbagatelli

If your intention is to have multiple records with the same unique key, then you should create a new primary key (using dbt_utils.surrogate_key for example) and use that as the unique_key.

If your intention is to not have multiple records with that unique key, then you need to implement some deduplication logic as minhajpasha suggested.

Hope that helps!

Topic		Replies	Views
incremental model + unique constraint still allows duplicates Help incremental	4	824	December 11, 2024
incremental model fails on second run Help incremental	0	1071	August 28, 2023
Handling merge statements in incremental models Help	2	14281	July 23, 2019
A few beginner questions on incremental capability Archive	3	3086	December 30, 2019
Am I using incremental table correctly? Help incremental , bigquery	6	2918	May 14, 2024

Duplicate Records on First run with Incremental Model on BigQuery

Related topics