Hello, my dbt project is starting to get large, and models are being used in multiple data flows. It’s great to reuse existing models, but scheduling jobs efficiently is becoming a challenge. Currently, we have jobs that build “core” models like this: dbt run --select +mart_1+.
The problem is that a dependency of mart_1 can also be a dependency of mart_2. The two marts really only have one table in common. If we schedule mart_2 as dbt run --select +mart_2+, then each job runs its full lineage and the shared table gets built twice. I really only want to run each model once, in dependency order.
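To make the overlap concrete, here is a toy sketch in Python. The DAG and every model name other than mart_1 and mart_2 are made up for illustration, and select_plus only imitates what the +model+ selector picks (the model, its ancestors, and its descendants); it is not dbt's own code.

```python
from graphlib import TopologicalSorter

# Hypothetical toy DAG: each model maps to the set of models it depends on.
DEPS = {
    "stg_orders": set(),
    "stg_payments": set(),
    "shared_dim": set(),
    "mart_1": {"stg_orders", "shared_dim"},
    "mart_2": {"stg_payments", "shared_dim"},
    "report_1": {"mart_1"},
    "report_2": {"mart_2"},
}


def upstream(model):
    """Return every ancestor of the model."""
    seen = set()
    stack = [model]
    while stack:
        for parent in DEPS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def downstream(model):
    """Return every model that has the given model as an ancestor."""
    return {m for m in DEPS if model in upstream(m)}


def select_plus(model):
    """Imitate +model+: the model plus all ancestors and descendants."""
    return upstream(model) | {model} | downstream(model)


def run_order(selected):
    """Order the selected models so that dependencies come first."""
    subgraph = {m: DEPS[m] & selected for m in selected}
    return list(TopologicalSorter(subgraph).static_order())


job_1 = select_plus("mart_1")
job_2 = select_plus("mart_2")
overlap = job_1 & job_2               # models built by both jobs
combined = run_order(job_1 | job_2)   # union: each model appears once
```

In this sketch overlap contains only shared_dim, which is the table that two separate jobs would each rebuild. The combined list is what a single run over the union looks like. As far as I know, passing both selectors to one command (dbt run --select +mart_1+ +mart_2+) selects the union in the same way, but that only helps when both marts can share one schedule.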
My thought was to tell dbt that I want each model to run at most N times per day (once, in this case). If a job would trigger a model a second time, dbt would skip it and move on to the next model. Is there a mechanism to do this, or a more elegant way to solve the problem?
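Roughly, the behaviour I have in mind looks like the sketch below. This is not an existing dbt feature as far as I know; RunLedger and run_job are hypothetical names, the ledger lives only in memory, and the model lists are assumed to be in dependency order already.

```python
from datetime import date


class RunLedger:
    """Hypothetical in-memory record of how often each model ran per day."""

    def __init__(self, max_runs_per_day=1):
        self.max_runs_per_day = max_runs_per_day
        self._counts = {}  # (model, day) -> number of runs so far

    def should_run(self, model, day):
        return self._counts.get((model, day), 0) < self.max_runs_per_day

    def record(self, model, day):
        key = (model, day)
        self._counts[key] = self._counts.get(key, 0) + 1


def run_job(models, ledger, day):
    """Walk the models in order and skip any that already hit the daily cap."""
    executed, skipped = [], []
    for model in models:
        if ledger.should_run(model, day):
            # A real job would build the model here.
            ledger.record(model, day)
            executed.append(model)
        else:
            skipped.append(model)
    return executed, skipped


ledger = RunLedger(max_runs_per_day=1)
today = date(2024, 1, 1)  # arbitrary example date
first = run_job(["shared_dim", "mart_1"], ledger, today)
second = run_job(["shared_dim", "mart_2"], ledger, today)
```

Here the second job skips shared_dim because the first job already built it that day, and still builds mart_2. A real version would need the ledger stored somewhere both jobs can read, and it would have to decide what to do when a skipped model's earlier run failed.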
I could declare the one shared dependency as a source instead, but that feels like it could create circular logic over time as the project gets more and more nested.