I’m designing a blue/green deployment for our dbt project. We will have two environments PROD and DEV.
When anyone merges a PR into the develop branch a “slim deploy” gets triggered into DEV.
When we want to release a new version of our models we will trigger the “deploy to PROD” job, which will run dbt clone
cloning all our models into the production database.
Everything works, but the problem is that views are also cloned and we end up with models that have
select
*
from
development_db.schema.upstream_table
This means that if the upstream table changes in development, the data in production will also change.
To solve this, I want to modify te “deploy to prod” job into two actions:
- dbt clone
- dbt run all views
This will re-compile the code for the views making it point to the right database. Howver, I don’t know how to run only models that are materialized as views.
Can I run dbt run --select
only for models materialized as views??
This link also has an alternative solution to my problem: Performing a blue/green deploy of your dbt project on Snowflake
which is basically overriding the ref
macro to omit the database. But I would rather not do that (I don’t know if it would have unintended consequences)
I just found the answer in another discourse conversation. I can just do:
dbt run --select config.materialized:view
2 Likes
@Gonzalo933 glad you got that sorted out. We currently use the alternative solution (i.e., overriding ref
), which we inherited from a previous maintainer. Has definitely caused some annoying side effects, and I’d love to use clone instead.
-
Curious what your deployment macro looks like, if you don’t mind sharing!
-
Also curious if there’s a chance of downtime in production in between the clone step and the run all views step, if a data consumer tries to query a view that hasn’t been rebuilt yet.
Sure, I can share how I have set everything up.
We have two environments, development and production. Each one points to a different DB
- For development we have the classic Slim deploy. We do:
dbt seed --select state:modified --full-refresh
dbt build --select state:modified+
The job is running in Deferred mode against the Development environment.
- For production we have a Clone to prod job that runs:
dbt clone --full-refresh
dbt run --select config.materialized:view
The job is running in Deferred mode against the Development environment. That way the clone command uses “development” as the state to clone.
As for if there is a change of downtime, I don’t think so, for a few reasons:
- Our marts models are all tables, so they get replaced with the clone command. We have views for debugging purposes in intermediate models or for advanced users.
- Even if someone tried to query a view while the job is deploying, because development and production are clones, a view in production that points to development would contain the same data.
Awesome - thanks for the detailed explanation!