I am interested in analyzing complex dbt pipelines and integrating them with our in-house orchestrator. I noticed that dbt already builds a manifest.json file that summarizes the whole pipeline as an execution plan. This seems like a good way of integrating a dbt-built pipeline with other tools where running dbt run is not an option. If I can confirm that dbt doesn't do any additional operations at runtime, and that whatever is built after dbt compile is exactly what gets executed during a run, I can change the runner to be something other than dbt.
My question is: does dbt do any additional calculations / materialization / running macros after the manifest file is built?
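For context, the kind of integration I have in mind is roughly this: read the nodes and their depends_on edges out of manifest.json, build a DAG, and hand our orchestrator a valid execution order. A minimal sketch (assuming the standard manifest layout, where each entry in nodes carries a resource_type and a depends_on.nodes list; the path target/manifest.json is just the default output location):

```python
import json
from collections import defaultdict

def load_dbt_dag(manifest_path):
    """Build a model-level dependency graph from a dbt manifest.json.

    Returns {unique_id: [parent model unique_ids]} restricted to models.
    """
    with open(manifest_path) as f:
        manifest = json.load(f)
    graph = {}
    for unique_id, node in manifest["nodes"].items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, etc.
        parents = [p for p in node["depends_on"]["nodes"]
                   if p.startswith("model.")]
        graph[unique_id] = parents
    return graph

def topo_order(graph):
    """Kahn's algorithm: return models in a valid execution order."""
    indegree = {n: 0 for n in graph}
    children = defaultdict(list)
    for node, parents in graph.items():
        for p in parents:
            if p in indegree:
                indegree[node] += 1
                children[p].append(node)
    ready = sorted(n for n, d in indegree.items() if d == 0)
    order = []
    while ready:
        n = ready.pop(0)
        order.append(n)
        for child in children[n]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    return order

# Typical usage: schedule topo_order(load_dbt_dag("target/manifest.json"))
```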
Yes, it runs a lot of macros after the manifest is built. To give a couple of examples: with the dbt-snowflake adapter, it runs macros such as dbt_snowflake_validate_get_incremental_strategy and dbt_snowflake_get_incremental_sql for incremental materializations.
Because model SQL may be dynamically templated based on the results of a previous model, there’s no way to pre-compile all the SQL and ship it off for execution elsewhere—dbt needs to be involved from beginning to end.
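To illustrate why, here is the kind of model that can't be fully pre-compiled. This is a hypothetical pivot model using dbt_utils.get_column_values, which issues a query against the warehouse while rendering the template, so the final SQL depends on data that only exists at run time:

```sql
-- models/payments_pivoted.sql (illustrative)
{% set methods = dbt_utils.get_column_values(
    table=ref('raw_payments'), column='payment_method') %}

select
    order_id
    {% for method in methods %}
    , sum(case when payment_method = '{{ method }}' then amount end)
        as {{ method }}_amount
    {% endfor %}
from {{ ref('raw_payments') }}
group by 1
```

If raw_payments gains a new payment_method value between compile and run, the generated column list changes, which is exactly why dbt has to stay in the loop.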
You mentioned you’re using an in-house orchestrator, so I’m not sure what a solution might look like for you. Perhaps if your orchestrator supports running a docker image, you could package your dbt project into an image (along with the dbt-core package), and build your models that way.
Understood, thanks a lot for the detailed answer. One of the options is to package it like you said and run the whole dbt pipeline there, but ideally I’d like to be able to mix and match things so that another asset can depend on an intermediate model dbt produces, etc.
Is there an example where I can see such dynamic dependencies?
The most common approach I've seen for that kind of mixing and matching is to split the dbt run into separate layers. E.g. if you have three layers, base, intermediate, and marts, you might run three separate dbt jobs:
dbt run -s tag:base
dbt run -s tag:intermediate
dbt run -s tag:marts
Depending on your orchestrator, you may also be able to use "sensors", similar to how Sensors work in Airflow. If that's possible, you could create a sensor that waits for a specific dbt model (i.e. table) to be updated.
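A minimal sketch of such a sensor, assuming your orchestrator can run arbitrary Python. The get_last_updated callback is hypothetical and up to you, e.g. a query against your warehouse's table metadata (on Snowflake, information_schema.tables.last_altered):

```python
import time

def wait_for_model_update(get_last_updated, previous_ts,
                          timeout_s=3600, poll_s=60):
    """Poll until the model's last-updated timestamp advances past previous_ts.

    get_last_updated: zero-arg callable returning the model's last-updated
    timestamp (or None if the table doesn't exist yet).
    Returns the new timestamp, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        ts = get_last_updated()
        if ts is not None and ts > previous_ts:
            return ts
        time.sleep(poll_s)
    raise TimeoutError("model was not refreshed within the timeout")
```

Once the sensor fires, the orchestrator can kick off whatever downstream asset depends on that intermediate model.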