Concurrent dbt runs

We use Airflow to orchestrate our dbt runs.

Currently, our entire pipeline runs once a day. However, we have requests to increase the frequency to as often as once every 15 minutes for parts of the pipeline.

My question is: if Airflow orchestrates multiple dbt runs for various parts of the pipeline in parallel, is there any chance of data loss, locks, etc.? Are there any disadvantages to having multiple dbt runs executing at the same time? What happens if the full dbt run overlaps with the smaller run that executes every 15 minutes? We have some incremental models as well. Airflow is running in Cloud Composer.

Our warehouse is BigQuery. Thanks.

I don't think multiple dbt instances/runtimes would have any knowledge of each other, so I'm not sure how you would prevent concurrent runs. If a single run with many threads works for you, that would be no problem.

Otherwise (thinking out loud, since I need to work out a similar solution): if there were a convention that each dbt run executes only a single model at a time, you could write out a lockfile (or use a key-value store) to show that that particular model is being run/refreshed. Any other run that fails to acquire the lock would then skip that model.
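
To make the lockfile idea concrete, here is a rough Python sketch. This is not something dbt provides; the names `model_lock`, `run_model_if_free`, `ModelLocked` and the lock directory are all made up for illustration. It assumes every dbt run can see the same filesystem, which is probably not true for distributed Airflow workers, so there you would need to swap the file for a shared key-value store. It also does nothing about stale locks left behind when a process is killed.

```python
import os
import subprocess
import tempfile
from contextlib import contextmanager

# Hypothetical lock location; it must be visible to every dbt run.
LOCK_DIR = os.path.join(tempfile.gettempdir(), "dbt_model_locks")


class ModelLocked(Exception):
    """Raised when another run already holds the lock for a model."""


@contextmanager
def model_lock(model_name, lock_dir=LOCK_DIR):
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, f"{model_name}.lock")
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so only one process can create the lockfile.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise ModelLocked(model_name) from None
    try:
        # Record the owner's PID to help with debugging stuck locks.
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        yield path
    finally:
        os.remove(path)


def run_model_if_free(model_name, runner=subprocess.run, lock_dir=LOCK_DIR):
    """Run one dbt model unless another run holds its lock.

    Returns True if dbt was invoked, False if the model was skipped.
    """
    try:
        with model_lock(model_name, lock_dir):
            runner(["dbt", "run", "--select", model_name], check=True)
            return True
    except ModelLocked:
        return False
```

Each Airflow task would then call something like `run_model_if_free("my_model")` and treat a `False` return as "skipped, someone else is refreshing it". Whether skipping is acceptable for incremental models depends on how much staleness you can tolerate between runs.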