(near) zero-downtime deployments of updated models (an alternative to virtual data environments?)

I’m working on a project for (near) zero-downtime dbt deployments of updated models. Curious to know if anyone has worked on a similar idea!

current problem

we run dbt builds every hour, but some of our model changes require full refreshes and other expensive operations that can take >1 hour

to avoid data freshness problems, we need a way to build the new version of a model while the old version keeps serving queries, and then swap the new version in once it’s fully materialized

solution idea

(this is loosely inspired by the Tobiko Data article about virtual data environments)

we’re trying to leverage Snowflake’s zero-copy cloning to create “data branches”: clones of each modified model and all of its child models, with references to parent models deferred to prod instead of cloned
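a zero-copy clone in Snowflake is a metadata-only operation, so branching even large tables is near-instant. here’s a minimal sketch of the branching step, assuming a prod schema ANALYTICS.PROD, a branch schema ANALYTICS.BRANCH_42, and a modified model orders with one child order_metrics (all names are made up for illustration):

```sql
-- create a schema to hold the data branch (hypothetical names throughout)
CREATE SCHEMA IF NOT EXISTS ANALYTICS.BRANCH_42;

-- zero-copy clone the modified model and its downstream children;
-- clones share storage with the source until either side changes
CREATE TABLE ANALYTICS.BRANCH_42.ORDERS CLONE ANALYTICS.PROD.ORDERS;
CREATE TABLE ANALYTICS.BRANCH_42.ORDER_METRICS CLONE ANALYTICS.PROD.ORDER_METRICS;

-- parent models are NOT cloned: the branch's dbt run resolves them
-- to ANALYTICS.PROD via deferral, reading them in place
```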

the ideal workflow looks like:

  • edit models on dev
  • create a branch of dbt ls --models edited_models+ --defer prod using Snowflake zero-copy cloning
  • run dbt full refreshes on the new branch to update the models
  • “merge” the new branch into prod by dropping the prod tables and renaming the branch tables into the prod namespace (in the same transaction; see the sketch after this list)
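
one caveat on the “merge” step: Snowflake DDL autocommits, so a multi-statement drop-and-rename isn’t truly atomic. ALTER TABLE ... SWAP WITH exchanges two tables in a single atomic statement, which may be a safer primitive per model (though a multi-table swap is still a sequence of statements, not one transaction). a sketch using the same hypothetical names as above:

```sql
-- atomically exchange each branch table with its prod counterpart;
-- readers see either the old or the new version, never a missing table
ALTER TABLE ANALYTICS.PROD.ORDERS SWAP WITH ANALYTICS.BRANCH_42.ORDERS;
ALTER TABLE ANALYTICS.PROD.ORDER_METRICS SWAP WITH ANALYTICS.BRANCH_42.ORDER_METRICS;

-- the branch schema now holds the old prod versions; drop it to clean up
DROP SCHEMA ANALYTICS.BRANCH_42;
```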