Full refresh incremental model on subset

spartakbadalyan · April 18, 2024, 1:17pm

Hi everyone!

I was wondering if there’s a way to perform a full refresh on a subset of an incremental model, for example based on a date column.

A "delete from my_table where date_column > 'yyyy-mm-dd'" statement would be the logical choice here, but this is a restricted environment, and I can only run dbt commands from the cluster’s CLI.

I also thought of using a pre_hook, but that would mean modifying my model and opening a PR, which isn’t very convenient.
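For reference, the pre_hook route mentioned above might look roughly like the sketch below. It is only an illustration under stated assumptions: the model and column names (my_table, my_upstream_model, date_column) and the delete_date_start variable are hypothetical, and it assumes is_incremental() evaluates inside a hook the same way it does in the model body, which is worth checking on your adapter.

```sql
-- models/my_table.sql (hypothetical model; all names are illustrative)
{{
  config(
    materialized='incremental',
    pre_hook="{% if is_incremental() %} delete from {{ this }} where date_column > '{{ var('delete_date_start', '9999-12-31') }}' {% endif %}"
  )
}}

select *
from {{ ref('my_upstream_model') }}
{% if is_incremental() %}
  -- reload only rows newer than what remains after the pre_hook delete
  where date_column > (select max(date_column) from {{ this }})
{% endif %}
```

With the far-future default, the hook deletes nothing unless a date is passed at run time, e.g. dbt run --select my_table --vars '{delete_date_start: 2024-05-29}'. The guard is there because the table does not exist yet on the very first run.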

Do you have any idea how I could solve this, or a workaround to suggest?

Thank you in advance for your time!

a_slack_user · April 18, 2024, 1:40pm

It’s very normal to want to run these types of DML statements on the data warehouse from time to time for maintenance, schema evolution, etc. Ten years ago or so, this was a huge part of the day-to-day work of a data engineer.

So the logical solution is to give your data/analytics engineers the ability to run DML statements against the data warehouse (probably through JIT privilege escalation) and to have a process to make sure they don’t screw anything up too much when they do.

_{Note: @Mike Stanley originally posted this reply in Slack. It might not have transferred perfectly.}

a_slack_user · April 18, 2024, 1:40pm

But dbt doesn’t really have tools to help you with this, unfortunately.

_{Note: @Mike Stanley originally posted this reply in Slack. It might not have transferred perfectly.}

spartakbadalyan · May 30, 2024, 10:05am

I managed to find a workaround.

First, create a macro:

{# Delete rows whose date column is later than delete_date_start. #}
{% macro delete_from_table(table_to_delete_from, date_column_name, delete_date_start) %}
    {# Resolve the model name to its relation in the warehouse. #}
    {% set table_ref = ref(table_to_delete_from) %}
    {% set sql %}
        delete from {{ table_ref }}
        where {{ date_column_name }} > '{{ delete_date_start }}'
    {% endset %}
    {% do run_query(sql) %}
{% endmacro %}

Then run the macro:
dbt run-operation delete_from_table --args '{table_to_delete_from: my_table, date_column_name: updated_at, delete_date_start: 2024-05-29}'
