As an analytics engineer on Fishtown’s professional services team, I spent the past year working on 10 different dbt projects for customers. I wrote the following guide to help our team easily switch back and forth between dbt versions. There were times when I was working on 2-4 different projects at once, and it was cumbersome to operate with an “all or nothing” mentality about which dbt version I could work on. I also wanted to make it easier to test beta releases of dbt without having to commit to the beta for every project I was actively working on.
I wanted to share this document with you all since it might help your team if you’re managing multiple dbt projects as well, or you’d like to easily test beta releases and keep up with the latest and greatest that dbt has to offer!
TLDR
- If you’ve ever wanted to get multiple versions of dbt running on your machine (specifically the latest release and maybe the latest pre-release, beta or rc), then you’ve come to the right place!
- This guide will help you get set up with 2 virtual environments (using venv):
  - dbt - latest official release
  - dbt-beta - latest pre-release
- This will help your team:
  - standardize the development experience - no more “this works on mine but idk why it doesn’t work on yours” problem
  - easily upgrade and manage local dbt versions - this should make it easier to both beta test new releases and swap back to existing stable releases when bugs are discovered
Why should I bother?
What are Virtual Environments?
Real Python does a great job at explaining what virtual environments are and why we need them in this primer (feel free to loop back to this later):
Python Virtual Environments: A Primer - Real Python
For now, the most important takeaway here is the concept of project isolation. Being able to isolate your projects allows you to install and handle multiple, potentially conflicting, package versions in one machine at the same time.
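To make project isolation concrete, here is a minimal sketch that creates two throwaway environments and prints where each one lives. The directory names `env-a` and `env-b` are made up for illustration, and `--without-pip` is only used to keep the demo fast and offline; the environments in this guide are created with pip included.

```shell
#!/usr/bin/env bash
# Create two throwaway environments in a temporary directory.
# (env-a and env-b are illustrative names, not part of this guide's setup.)
workdir="$(mktemp -d)"
python3 -m venv --without-pip "$workdir/env-a"
python3 -m venv --without-pip "$workdir/env-b"

# Each environment has its own interpreter entry point and its own prefix,
# so packages installed into one are not visible from the other.
prefix_a="$("$workdir/env-a/bin/python" -c 'import sys; print(sys.prefix)')"
prefix_b="$("$workdir/env-b/bin/python" -c 'import sys; print(sys.prefix)')"

echo "env-a prefix: $prefix_a"
echo "env-b prefix: $prefix_b"

# Clean up the demo environments.
rm -rf "$workdir"
```

The two prefixes point at different directories, which is what lets one environment hold the stable dbt release while another holds a pre-release.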
How will using Virtual Environments help your team?
In the same vein as having code style guides, organizing project directories consistently (using a ~/dev/ directory), and version controlling analytics, this is just another way to improve the developer experience and prevent unexpected code breakage. Here are a couple of ways it could benefit your team:
- Everyone will have the exact same local installation of dbt, in an isolated environment, guaranteeing consistency across all your machines. No more “I don’t understand why it doesn’t work on your machine, it works on mine.”
- You can now easily hot swap between dbt versions. If you want to run the latest stable release, you can. If you want to try the same project on the beta, you can do that too! All this with far fewer commands than it takes to upgrade or downgrade dbt.
- We can have a script manage your environments and make updating easy! If everyone has the same environments set up, it becomes really easy to build scripts that manage them across all your machines. (And this is what we did here!) No need to remember to upgrade both the dbt stable release and the beta; just run a simple command (like dbt-update) and voila, you have both environments updated!
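As a rough illustration of hot swapping, the sketch below runs a dbt command from a named environment without activating it first. It assumes the environments live under `~/.virtualenvs` (as in the script later in this guide); the helper name `dbt_in_env` and the `VENV_ROOT` variable are hypothetical and not part of dbt or of the script.

```shell
#!/usr/bin/env bash
# Root directory that holds the virtual environments.
# Assumption: matches the ~/.virtualenvs location used later in this guide.
VENV_ROOT="${VENV_ROOT:-$HOME/.virtualenvs}"

# dbt_in_env <env-name> [dbt args...]
# Hypothetical helper: runs the dbt executable that belongs to the given
# environment. Executables installed into a venv run with that venv's
# interpreter, so no activation is needed.
dbt_in_env() {
  local env_name="$1"
  shift
  local dbt_bin="$VENV_ROOT/$env_name/bin/dbt"
  if [[ ! -x "$dbt_bin" ]]; then
    echo "No dbt executable found at: $dbt_bin" >&2
    return 1
  fi
  "$dbt_bin" "$@"
}

# Example usage (requires the environments to exist):
#   dbt_in_env dbt --version
#   dbt_in_env dbt-beta --version
```

This is only a convenience on top of the activate/deactivate workflow described below; activating an environment remains the simplest option for longer working sessions.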
Why venv? Why not pyenv-virtualenv or conda?
While the other two alternatives (and many more) may well be more full-featured, they all require installing another package, whereas venv ships with Python 3 out of the box (it has been part of the standard library since Python 3.3). This makes it easy to use across all your machines without having to install extra tooling that might not be useful to you. Keeping this as simple as possible also helps maintainability!
Count me in! Where do I get started?
If you’re convinced this is the right solution, here’s a quick guide to get you set up!
- Copy the contents of this gist (dbt-update.sh):

    #!/usr/bin/env bash

    DBT_ENV=~/.virtualenvs/dbt
    DBT_BETA_ENV=~/.virtualenvs/dbt-beta

    # Create (or update/reinstall) one environment and install dbt into it.
    # $1: path to the environment, $2: "stable" or "pre"
    process_environment() {
      local env=$1
      local release=$2

      if [[ -d "$env" ]]; then
        echo ""
        echo "There is an existing dbt environment in: $env"
        echo -n "Would you like to update(u) or reinstall(r)? [u/r]: "
        read -r ans
        if [[ $ans == "r" ]]; then
          echo ""
          echo "Reinstalling!"
          echo "Deleting existing environment"
          rm -rf "$env" && echo "Successfully deleted existing environment"
        elif [[ $ans == "u" ]]; then
          echo "Updating!"
        else
          echo ""
          echo "Exiting script"
          exit 1
        fi
      fi

      echo ""
      echo "Creating dbt environment in: $env"
      python3 -m venv "$env" && echo "Successfully created dbt environment!"
      echo ""
      echo "Activating your dbt environment and installing the latest dbt version"
      if [[ $release == "stable" ]]; then
        source "$env/bin/activate" && pip install dbt -q -U && echo "Successfully installed dbt:"
      elif [[ $release == "pre" ]]; then
        # --pre lets pip pick up beta and release-candidate versions
        source "$env/bin/activate" && pip install dbt -q -U --pre && echo "Successfully installed dbt:"
      else
        exit 1
      fi
      dbt --version
      deactivate
    }

    echo ""
    echo "Initializing dbt environments"
    echo ""
    echo "=== main dbt environment ==="
    process_environment "$DBT_ENV" stable
    echo ""
    echo "=== dbt beta environment ==="
    process_environment "$DBT_BETA_ENV" pre

    echo ""
    echo "If you would like to get the commands to set your aliases"
    echo -n "to easily activate the environments respond with your shell or skip, [bash/zsh/skip]:"
    read -r ans_alias
    if [[ $ans_alias == "bash" ]]; then
      profile=~/.bash_profile
    elif [[ $ans_alias == "zsh" ]]; then  # check the answer, not $release
      profile=~/.zshrc
    else
      exit 1
    fi

    echo ""
    echo "If you don't already have these aliases to quickly activate the dbt environments"
    echo "run the following commands in your terminal then restart your terminal:"
    echo "echo \"alias dbt-activate='source $DBT_ENV/bin/activate'\" >> $profile"
    echo "echo \"alias dbt-beta-activate='source $DBT_BETA_ENV/bin/activate'\" >> $profile"
- Paste the contents into a file called dbt-update.sh in your ~/.dbt/ directory. This is the same folder where you’d find your profiles.yml file!
- Run the following commands in your terminal. You can figure out your shell by checking the top of your iTerm2 window to see what’s written; it should be either zsh or bash.
  - If you’re using zsh (the default shell since macOS Catalina):

        chmod +x ~/.dbt/dbt-update.sh  # this makes the file executable
        echo "alias dbt-update='~/.dbt/dbt-update.sh'" >> ~/.zshrc  # this allows dbt-update to be run from anywhere

  - If you’re using bash (disclaimer: I haven’t tested this, so let me know if you’re trying it, I would love to pair):

        chmod +x ~/.dbt/dbt-update.sh  # this makes the file executable
        printf "\nalias dbt-update='~/.dbt/dbt-update.sh'" >> ~/.bash_profile  # this allows dbt-update to be run from anywhere
- Restart your terminal by closing the window and opening a new one (you could also run the source command against your config file).
- Run the following command:

      dbt-update  # yes it's that simple!
  You should see some printed status messages letting you know what’s going on under the hood. The script will now set up the dbt and dbt-beta environments for the first time and install a fresh dbt version in each one. At the end of each install, the script prints the version of dbt that was installed.
- Once your environments are set up, run the two alias commands that the script prints at the end in your terminal.
- Restart your terminal one last time. You should now be able to run dbt-activate. After running the activate command, you should see your terminal prompt prefixed with the environment name (in this case dbt). You can do the same for the dbt-beta environment; just run dbt-beta-activate.
- To deactivate your environments, just run deactivate.
- The next time you need to update dbt, just re-run dbt-update. It should detect that you already have the environments set up. It will then prompt you to choose whether you’d like to update or reinstall (type any other character to cancel).
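The alias steps above can also be sketched as a pair of small helpers, in case you would rather not check your shell and edit your profile by hand. This is only an illustration under a few assumptions: the function names `profile_for_shell` and `add_alias_once` are hypothetical, the shell is detected from the `$SHELL` variable, and the profile files match the ones used in this guide (~/.zshrc and ~/.bash_profile).

```shell
#!/usr/bin/env bash
# profile_for_shell [shell-path]
# Hypothetical helper: maps a shell (defaults to $SHELL) to the profile
# file this guide writes aliases to. Fails for any other shell.
profile_for_shell() {
  case "$(basename "${1:-$SHELL}")" in
    zsh)  echo "$HOME/.zshrc" ;;
    bash) echo "$HOME/.bash_profile" ;;
    *)    return 1 ;;
  esac
}

# add_alias_once <profile-file> <alias-name> <command>
# Hypothetical helper: appends the alias only if the exact line is not
# already present, so re-running it does not create duplicates.
add_alias_once() {
  local profile="$1" name="$2" command="$3"
  local line="alias $name='$command'"
  touch "$profile"
  grep -qxF -- "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}

# Example usage:
#   profile="$(profile_for_shell)" || { echo "Unsupported shell" >&2; exit 1; }
#   add_alias_once "$profile" dbt-update '~/.dbt/dbt-update.sh'
#   add_alias_once "$profile" dbt-activate 'source ~/.virtualenvs/dbt/bin/activate'
#   add_alias_once "$profile" dbt-beta-activate 'source ~/.virtualenvs/dbt-beta/bin/activate'
```

Checking for the exact line before appending means the helper is safe to run repeatedly, which the plain `echo ... >>` commands are not.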
Improving this workflow
Is anyone familiar with a smoother way to deploy this? Right now it requires some familiarity with creating custom bash scripts and setting up aliases in your command line tool of choice. I would love to see a way to distribute this across multiple machines with a simple install, and maybe add some flexibility so that users could manage versions through configurable options instead of editing the raw dbt-update.sh file. Feel free to reach out here and share your thoughts if this is interesting to you or if you’ve tackled the same problem differently in your organization!
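One possible direction for the configurable options, sketched here purely as an assumption and not something the current script does: read an optional version from an environment variable and turn it into a pip requirement. The names `dbt_requirement` and `DBT_VERSION` are hypothetical, and the version number in the usage comment is only a placeholder.

```shell
#!/usr/bin/env bash
# dbt_requirement [version]
# Hypothetical helper: builds the requirement string to hand to pip.
# With no version it returns plain "dbt" (latest release); with a version
# it pins that exact release using pip's "==" syntax.
dbt_requirement() {
  local version="${1:-}"
  if [[ -n "$version" ]]; then
    echo "dbt==$version"
  else
    echo "dbt"
  fi
}

# Example usage inside an activated environment (placeholder version):
#   DBT_VERSION=0.19.0
#   pip install -q -U "$(dbt_requirement "$DBT_VERSION")"
```

With something like this, a user could pin a version per environment by setting a variable rather than editing the script itself.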
Other Helpful Links
An Effective Python Environment: Making Yourself at Home - Real Python