incremental model (Merge) without update

rjames · March 27, 2023, 11:09pm

Hi,
Is there a way to use the merge incremental strategy without updating rows whose unique ID already exists in the destination? The merge strategy inserts a record when there is no matching ID in the destination and updates its columns when the ID is already present. I want to ignore incoming records whose unique ID already exists in the destination and insert only new records.

joellabes · March 27, 2023, 11:30pm

dbt will only insert records that are returned by your query. Inside of your is_incremental block, if you use a not exists clause to ignore any records whose unique ID is already in the destination, then it will only return the new records which means they’re the only ones that will be inserted.
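
As a rough sketch of that idea (the model name stg_orders and the columns id, name and amount are hypothetical placeholders, not taken from this thread), the model could look something like this:

```sql
-- Sketch only: stg_orders, id, name and amount are assumed names.
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='id'
) }}

select
    src.id,
    src.name,
    src.amount
from {{ ref('stg_orders') }} as src

{% if is_incremental() %}
  -- On incremental runs, keep only rows whose id is not yet in the target table
  where not exists (
      select 1
      from {{ this }} as dest
      where dest.id = src.id
  )
{% endif %}
```

Because the query never returns an ID that already exists in the target, the merge has nothing to match on, so only the insert branch does any work.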

Surya · March 28, 2023, 11:18am

Create your own incremental strategy and assign it to your model using the config function.
You create the incremental strategy macro and write your merge logic in it.
dbt finds the macro by the name get_incremental_{strategy}_sql, so your macro has to follow that naming pattern.
dbt internally passes the arguments below to the user-defined incremental strategy macro:

{'target_relation': target_relation, 'temp_relation': tmp_relation, 'unique_key': unique_key, 'dest_columns': dest_columns, 'predicates': incremental_predicates }

The macro has to return the merge SQL; dbt then runs that SQL on the configured database.

User-defined incremental strategy:

{% macro get_incremental_{strategy}_sql(arg_dict) %}
  {# build your merge statement here and store it in merge_sql_query #}
  {{ return(merge_sql_query) }}
{% endmacro %}
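
For illustration, here is a minimal sketch of what an insert-only strategy might look like. The strategy name insert_new_only is made up for this example, and the sketch assumes unique_key is a single column name and that the adapter supports merge statements; treat it as a starting point rather than a tested implementation:

```sql
{# Sketch only: "insert_new_only" is a hypothetical strategy name. #}
{# Assumes unique_key is a single column, not a list of columns. #}
{% macro get_incremental_insert_new_only_sql(arg_dict) %}

  {%- set target = arg_dict["target_relation"] -%}
  {%- set source = arg_dict["temp_relation"] -%}
  {%- set unique_key = arg_dict["unique_key"] -%}
  {%- set cols_csv = get_quoted_csv(arg_dict["dest_columns"] | map(attribute="name")) -%}

  {%- set merge_sql -%}
    merge into {{ target }} as DBT_INTERNAL_DEST
    using {{ source }} as DBT_INTERNAL_SOURCE
      on DBT_INTERNAL_SOURCE.{{ unique_key }} = DBT_INTERNAL_DEST.{{ unique_key }}
    -- no "when matched" branch, so existing rows are left untouched
    when not matched then insert ({{ cols_csv }})
      values ({{ cols_csv }})
  {%- endset -%}

  {{ return(merge_sql) }}

{% endmacro %}
```

A model would then opt in with something like config(materialized='incremental', incremental_strategy='insert_new_only', unique_key='id'), where id is again a placeholder column name.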

joellabes · March 28, 2023, 10:38pm

This is true, but is probably overkill given that the default merge strategy will work out of the box if it’s given the correct rows to work with.

rjames · April 12, 2023, 6:47pm

Thank you for your reply on the incremental model. I am looking for a way to build an incremental model that updates a row only when its data has changed (changing attributes). E.g. if ID 123 is already loaded in the model and ID 123 flows in again in the next run, I need to check whether other fields (rates, amount, name, etc.) have changed and update the row if so. If nothing has changed, I need to skip the upsert for that ID. Is there a way to implement this with an existing dbt feature, or do we need to write a query to support this?

Surya · April 13, 2023, 5:14am

Use an audit field (a timestamp) in your source table that records when each row was inserted or last updated. Use this audit field in your incremental logic to filter for inserted/updated records.

Example:

select
  col_a,
  col_b,
  col_c

from raw_app_data.events

{% if is_incremental() %}

  -- this filter will only be applied on an incremental run
  where event_time > (select max(event_time) from {{ this }})

{% endif %}

Here raw_app_data.events is the source table, and it has an audit field, event_time.

joellabes · April 20, 2023, 4:14am

This is exactly right - dbt assumes that incremental models have some way of identifying which records have changed. Ideally you can use the modified_date or something on the source table.

Otherwise you would need to check each column whose values you care about inside of your is_incremental() block.
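
A sketch of that column-by-column check, assuming a hypothetical upstream model stg_rates with columns id, name, rate and amount, an id that is unique in the target, and a warehouse that supports is distinct from (a null-safe comparison; not every database has it):

```sql
-- Sketch only: stg_rates and the column names are assumed for illustration.
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='id'
) }}

select
    src.id,
    src.name,
    src.rate,
    src.amount
from {{ ref('stg_rates') }} as src

{% if is_incremental() %}
  -- Compare each incoming row with the row already in the target table
  left join {{ this }} as dest
    on dest.id = src.id
  where dest.id is null                          -- brand new id
     or dest.name   is distinct from src.name    -- or any tracked column changed
     or dest.rate   is distinct from src.rate
     or dest.amount is distinct from src.amount
{% endif %}
```

Only new or changed rows reach the merge, so unchanged IDs are skipped and changed ones are updated through the unique_key match.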

system · April 27, 2023, 4:15am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
