Hello, I’m trying to set up my local development environment so that I don’t have to rebuild upstream models every time I need to make a change or create a new model that uses models that are already built in the production environment.
I just learned about the defer feature and it looks like it will solve all my problems, but I’m a noob at this so I want to make sure that I set up everything correctly and don’t mess up my data.
Here’s how I run my project at the moment:
A daily job builds all models through dbt Cloud (free account) in the production environment
I use Snowflake with two databases: dbt_PROD and dbt_DEV
dbt_PROD is built from the master branch of the project repo on GitHub
I use dbt Core for local development against the dbt_DEV database, always on branches other than master, since I have disabled pushing new code directly to master
The “target” folder is listed in .gitignore, so the manifest is not tracked by Git and stays the same when I switch branches
I would like to use the production manifest.json in the defer flag so that I don’t have to rebuild upstream models.
My question is: is it enough to check out the master branch, pull the latest changes from the repo, and run “dbt compile” before doing any development on other branches where I use the defer flag? Will that ensure that I use the production manifest file for --state?
Is this the right way to do what I’m trying to do, or is there a better way to get the production manifest for deferring purposes?
Also, is there a way to make sure all dbt build/run commands use the defer method by default on the dev environment? Would that be a bad practice? I would like to make this easier for future collaborators on the project and avoid them having to add this flag every time.
I tried testing this and it doesn’t work; it still builds the upstream models. I figured that dbt runs a compile step when I build the new model, which would overwrite the manifest in “target”, so I copied the manifest into a separate folder and ran “dbt build -s +my_model --defer --state prod_manifest”, but it is still building the upstream models.
Even if I run “dbt compile --target prod” and fetch the production manifest, it still doesn’t work. :S
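The copy step described above can be sketched as follows. This is only a minimal illustration: `snapshot_manifest` is a hypothetical helper (not part of dbt), the folder name `prod_manifest` is taken from the command above, and the demo uses a throwaway directory standing in for a real dbt project.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot_manifest(target_dir: Path, state_dir: Path) -> Path:
    """Copy target/manifest.json into a separate folder to pass to --state."""
    source = target_dir / "manifest.json"
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found; run dbt compile first")
    state_dir.mkdir(parents=True, exist_ok=True)
    destination = state_dir / "manifest.json"
    shutil.copy2(source, destination)
    return destination


# Demo on a temporary directory that stands in for a dbt project.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "target").mkdir()
    (project / "target" / "manifest.json").write_text('{"nodes": {}}')

    copied = snapshot_manifest(project / "target", project / "prod_manifest")

    # Simulate a later dbt invocation rewriting target/manifest.json:
    # the snapshot in prod_manifest/ is unaffected.
    (project / "target" / "manifest.json").write_text('{"nodes": {"changed": {}}}')
    snapshot_text = copied.read_text()
    print(snapshot_text)
```

Keeping the snapshot outside “target” matters because later dbt commands write a fresh manifest there, so pointing --state at “target” itself would compare the project against its own current state.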
@lchien063 Hello, I am not sure I understood what you mean by ‘re-compile when pushed into master branch’. I am hoping you can give me some insight into my case. I have two branches, production and acceptance (main). I keep the production manifest.json in an S3 bucket, and when there is a pull request from a developer’s own branch into acceptance, I defer to the production manifest.json with acceptance as the target profile. The dbt command looks like “dbt build --select state:modified+ --defer --state=prod-run-artifacts --target acc”. As you can see, it defers to the production manifest.json (stored under prod-run-artifacts) and uses acceptance as the target profile. The reason we are doing this:
We want to build changed models (and their downstream models) in the acceptance database before deploying them to the production database. To find which models were changed or created, we use production as the reference through the production manifest.json.
However, when the above dbt build command runs, we see that all upstream models start running as well, which is of course not ideal because we want to save cost and compute.
Do you have any idea why this is happening? How can we make sure defer and state work as expected when we use a different target, as in our case?
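To make the "production as reference" idea concrete, here is a rough sketch of comparing two manifests. It is a deliberate simplification: it only compares the per-node `checksum` entries found under `nodes` in manifest.json, whereas dbt's own `state:modified` selector considers more than file contents. The function name `changed_nodes` and the `model.demo.*` ids are made up for illustration.

```python
from typing import Dict, List


def changed_nodes(prod_manifest: Dict, current_manifest: Dict) -> List[str]:
    """Return ids of nodes that are new or whose file checksum differs from prod."""
    prod_nodes = prod_manifest.get("nodes", {})
    changed = []
    for node_id, node in current_manifest.get("nodes", {}).items():
        prod_node = prod_nodes.get(node_id)
        if prod_node is None:
            changed.append(node_id)  # node does not exist in production yet
            continue
        current_sum = node.get("checksum", {}).get("checksum")
        prod_sum = prod_node.get("checksum", {}).get("checksum")
        if current_sum != prod_sum:
            changed.append(node_id)  # file contents differ from production
    return sorted(changed)


# In-memory stand-ins for prod-run-artifacts/manifest.json and target/manifest.json.
prod = {
    "nodes": {
        "model.demo.stg_orders": {"checksum": {"name": "sha256", "checksum": "abc"}},
        "model.demo.orders": {"checksum": {"name": "sha256", "checksum": "def"}},
    }
}
current = {
    "nodes": {
        "model.demo.stg_orders": {"checksum": {"name": "sha256", "checksum": "abc"}},
        "model.demo.orders": {"checksum": {"name": "sha256", "checksum": "xyz"}},
        "model.demo.new_model": {"checksum": {"name": "sha256", "checksum": "123"}},
    }
}

result = changed_nodes(prod, current)
print(result)
```

With real files, the two dictionaries would come from `json.load` on each manifest.json. A check like this can help confirm which nodes differ from production before looking at how the selection is expanded.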
@mare011 Dear Marko, since I don’t see your reply anymore, may I ask if you have found a solution to your problem? If so, could you please share it? I am having the same issue when I use defer and state with the production manifest and a different target environment.
This is not exactly how defer should be used. With the command you specified, “dbt build -s +my_model --defer --state prod_manifest”, the + tells dbt to build my_model together with all of its upstream models. What you actually want is to build only “my_model” and have its references to upstream models resolve to the relations recorded in the production manifest.
I think you are almost there; it seems you need to run “dbt build -s my_model --defer --state prod_manifest” (without the +).
For the test operation, your + will work, because then the defer will just make sure that you are testing the production tables.
“dbt test -s +my_model --defer --state prod_manifest”
For “run” and, I believe, “compile”, the + will behave as in the “build” example.
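The effect of the + can be illustrated with a toy model of selection. This is a sketch only, not dbt's implementation: the three-model DAG, `select`, and `deferred_refs` are all hypothetical, and real deferral also depends on dbt version and on what already exists in the dev target.

```python
from typing import Dict, List, Set

# Hypothetical DAG: each model maps to its direct parents.
PARENTS: Dict[str, List[str]] = {
    "raw_orders": [],
    "stg_orders": ["raw_orders"],
    "my_model": ["stg_orders"],
}


def ancestors(model: str) -> Set[str]:
    """Collect every upstream model by walking parent links."""
    found: Set[str] = set()
    stack = list(PARENTS[model])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS[parent])
    return found


def select(spec: str) -> Set[str]:
    """Resolve 'model' or '+model' to the set of models that would be built."""
    if spec.startswith("+"):
        model = spec[1:]
        return {model} | ancestors(model)
    return {spec}


def deferred_refs(selected: Set[str]) -> Set[str]:
    """Direct parents outside the selection: the refs that --defer can redirect."""
    return {parent for model in selected for parent in PARENTS[model]} - selected


with_plus = select("+my_model")
without_plus = select("my_model")
print(sorted(with_plus), sorted(deferred_refs(with_plus)))
print(sorted(without_plus), sorted(deferred_refs(without_plus)))
```

With the +, every upstream model is part of the selection, so nothing is left to defer. Without it, only my_model is built and its reference to stg_orders is the one that can be resolved from the state manifest.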
Hi, sorry for the late reply but I think your approach is correct.
This comes down to the difference between dbt run and dbt build.
When I use dbt build, it also runs all upstream models; dbt run, however, only selects the modified and downstream models.
If you need to test the models, you can split the build command into separate dbt run and dbt test commands.
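A small sketch of that split, reusing the selector, state folder, and target from the command quoted earlier in the thread. `split_build` is a hypothetical helper; it only assembles the two command lines and does not invoke dbt.

```python
import shlex
from typing import List


def split_build(select: str, state_dir: str, target: str) -> List[List[str]]:
    """Return a dbt run command followed by a dbt test command with shared flags."""
    shared = ["--select", select, "--defer", "--state", state_dir, "--target", target]
    return [["dbt", "run", *shared], ["dbt", "test", *shared]]


commands = split_build("state:modified+", "prod-run-artifacts", "acc")
for command in commands:
    # Print the command as it would be typed in a shell.
    print(shlex.join(command))
```

Each list could be passed to `subprocess.run(command, check=True)` in a CI script so that the test step only runs if the run step succeeds.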