How to run state:modified+ but capture select upstream changes too

luca.odinga · August 12, 2024, 9:53am

The problem I’m having:

We would like to use a deployment strategy that only builds the models that have been modified. However, if you only build downstream models, an intermediate model that joins two upstream tables gets rebuilt after only one of those tables has been refreshed, so the intermediate is missing information during the day.

The context of why I’m trying to do this

Our current daily job runs about 250 models. We run it every night to capture all changes. During working hours, however, we would prefer our CD job to run only the changed models, without losing out on some upstream changes. For example:

stg_order
stg_invoice

These two combine into int_invoice_orders.

If we change stg_order and only run downstream, we will have orders without invoices in our int_invoice_orders.

Is there a way to run state:modified+ but also check for upstream staging tables of intermediates and run those too?

What I’ve already tried

I’ve looked at the documentation and tried creating a macro using ChatGPT and my own brains, but so far I haven’t managed to get to the desired command statement.

Thanks for any help you can provide!
Luca

brunoszdl · August 12, 2024, 1:29pm

You could try something like

dbt build -s "state:modified+ @state:modified,path:models/staging"

If I am not wrong, this should run:

– The modified models and their downstream models (state:modified+)
– Models that are at the same time
  – Parents of the downstream models (@state:modified; see Graph operators | dbt Developer Hub)
  – Inside the staging folder (path:models/staging, assuming your staging models live in models/staging)

I didn’t test it
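To make the selection logic easier to reason about, here is a minimal Python sketch that imitates the selector on the three models from the question. It is not dbt code: the helpers `walk`, `plus` and `at` are hypothetical stand-ins for the `+` and `@` graph operators, and the `STAGING` set assumes both stg_ models sit in the staging folder.

```python
# Toy DAG from the thread: each model maps to its direct parents.
PARENTS = {
    "stg_order": set(),
    "stg_invoice": set(),
    "int_invoice_orders": {"stg_order", "stg_invoice"},
}
# Assumption: both stg_ models live in the staging folder.
STAGING = {"stg_order", "stg_invoice"}

# Invert the parent map to get each model's direct children.
CHILDREN = {}
for child, parents in PARENTS.items():
    for parent in parents:
        CHILDREN.setdefault(parent, set()).add(child)


def walk(start, edges):
    """Return every node reachable from `start` by following `edges`."""
    seen, stack = set(), list(start)
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def plus(selected):
    """Rough stand-in for `selected+`: the models plus all descendants."""
    return set(selected) | walk(selected, CHILDREN)


def at(selected):
    """Rough stand-in for `@selected`: descendants plus all their ancestors."""
    base = plus(selected)
    return base | walk(base, PARENTS)


modified = {"stg_order"}  # pretend only stg_order changed

# Space in the selector = union, comma = intersection.
downstream_only = plus(modified)
with_staging_parents = plus(modified) | (at(modified) & STAGING)

print(sorted(downstream_only))       # ['int_invoice_orders', 'stg_order']
print(sorted(with_staging_parents))  # ['int_invoice_orders', 'stg_invoice', 'stg_order']
```

With only `state:modified+`, stg_invoice is left out; adding the intersected `@` term pulls it back in, which is the behaviour asked for in the question.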

luca.odinga · August 12, 2024, 5:40pm

Hi Bruno,

Thanks for replying! I can’t believe I didn’t know about the @ operator. This seems to be exactly what I was looking for.

Thank you!

brunoszdl · August 12, 2024, 5:52pm

@luca.odinga awesome! Just be careful with it, because it is supposed to run ‘all ancestors of all descendants of the selected model’, so it can run a lot of stuff.

That’s why I added the staging models path with the intersection operator (,), so it limits the parents to the ones that are staging models.
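As a rough illustration of that difference, the sketch below compares a bare `@` selection with the intersected one on a slightly larger graph. Only stg_order, stg_invoice and int_invoice_orders come from this thread; the other model names are hypothetical, and treating every model whose name starts with `stg_` as a staging model is an assumption made for the example.

```python
# Each model maps to its direct parents. Models other than stg_order,
# stg_invoice and int_invoice_orders are hypothetical.
PARENTS = {
    "stg_order": set(),
    "stg_invoice": set(),
    "stg_customer": set(),
    "int_invoice_orders": {"stg_order", "stg_invoice"},
    "int_customer_profile": {"stg_customer"},
    "fct_revenue": {"int_invoice_orders", "int_customer_profile"},
}
# Assumption: staging models are the ones prefixed with stg_.
STAGING = {name for name in PARENTS if name.startswith("stg_")}

# Invert the parent map to get each model's direct children.
CHILDREN = {}
for child, parents in PARENTS.items():
    for parent in parents:
        CHILDREN.setdefault(parent, set()).add(child)


def walk(start, edges):
    """Return every node reachable from `start` by following `edges`."""
    seen, stack = set(), list(start)
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def plus(selected):
    """Rough stand-in for `selected+`."""
    return set(selected) | walk(selected, CHILDREN)


def at(selected):
    """Rough stand-in for `@selected`."""
    base = plus(selected)
    return base | walk(base, PARENTS)


modified = {"stg_order"}

bare_at = at(modified)                              # @state:modified
limited = plus(modified) | (at(modified) & STAGING)  # with the staging intersection

print(len(bare_at), len(limited))  # 6 5
print(sorted(bare_at - limited))   # ['int_customer_profile']
```

In this toy graph the bare `@` selects every model, while the intersected version skips the non-staging ancestor int_customer_profile. On a real project with hundreds of models the gap is likely to be much larger.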

luca.odinga · August 13, 2024, 11:55am

Hi Bruno,

Thanks for the heads-up. I’ve been testing the @ operator, and in truth what I think we would need is something like:
all ancestors of all descendants of the selected model AND the descendants of those ancestors
However, that means even more models would be run, which in some cases brings us almost to a full CD job.
A full job currently takes almost 6 hours on an S-size warehouse in Snowflake. There is a lot we still need to learn and improve on, I’m afraid. Still very much a dbt noob here.
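For anyone wanting to estimate how large that wider selection would get, here is a minimal Python sketch of the set described above: the `@` selection plus every descendant of the models it contains. It only simulates the graph logic and is not a dbt selector. Apart from the three models named in this thread, the model names are hypothetical, as are the helpers `walk`, `plus` and `at`.

```python
# Each model maps to its direct parents. Models other than stg_order,
# stg_invoice and int_invoice_orders are hypothetical.
PARENTS = {
    "stg_order": set(),
    "stg_invoice": set(),
    "stg_customer": set(),
    "stg_product": set(),
    "int_invoice_orders": {"stg_order", "stg_invoice"},
    "fct_revenue": {"int_invoice_orders", "stg_customer"},
    "dim_customer": {"stg_customer"},
    "dim_product": {"stg_product"},
}

# Invert the parent map to get each model's direct children.
CHILDREN = {}
for child, parents in PARENTS.items():
    for parent in parents:
        CHILDREN.setdefault(parent, set()).add(child)


def walk(start, edges):
    """Return every node reachable from `start` by following `edges`."""
    seen, stack = set(), list(start)
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def plus(selected):
    """The selected models plus all of their descendants."""
    return set(selected) | walk(selected, CHILDREN)


def at(selected):
    """Descendants of the selection plus all ancestors of those descendants."""
    base = plus(selected)
    return base | walk(base, PARENTS)


modified = {"stg_order"}

at_selection = at(modified)    # ancestors of all descendants
wider = plus(at_selection)     # ...AND the descendants of those ancestors

print(sorted(wider - at_selection))  # ['dim_customer']
print(sorted(set(PARENTS) - wider))  # ['dim_product', 'stg_product']
print(len(wider), "of", len(PARENTS), "models selected")  # 6 of 8 models selected
```

Only branches that share no ancestor with the changed model's descendants stay out of the run, which matches the observation that this can approach a full build on a well-connected project.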

Thanks for thinking along! I hope the @ operator will help us out in the end

