Tips and Tricks about working with dbt

emilie · March 27, 2019, 12:50pm

Working with dbt for a while, one can start to develop workflows that are really useful. It would be great to collect these somewhere, so that beginner dbt-ers can benefit from those experiences.

Here is my tip:

When I was really into R, I would use the beepr package all the time because then I could switch to a new task without losing sight of my priority. Many moons ago, I asked about a built-in dbt beep and Drew taught me the dbt run .. && say beep trick.
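For anyone not on macOS (where say lives), here is a minimal sketch of the same idea as a reusable function. The name beep_after is hypothetical, and unlike the && version it makes a sound whether the command succeeds or fails, then hands back the command's exit code.

```shell
# Hypothetical helper: run any command, then make a sound when it finishes.
# Assumption: `say` may not be installed, so fall back to the terminal bell.
beep_after() {
    "$@"
    local rc=$?                      # remember the wrapped command's exit code
    if command -v say >/dev/null 2>&1; then
        say "beep"
    else
        printf '\a' >&2              # terminal bell, sent to stderr to keep stdout clean
    fi
    return "$rc"
}
```

Usage would look like beep_after dbt run.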

What is something you have in your workflow that might make it easier for someone to work in/around dbt?

drew · March 27, 2019, 1:21pm

Nice! The one trick I really like involves using some bash magic to run all of the changed files on a branch.

Snippet:

dbt run --models $(git diff --name-only | grep '\.sql$' | awk -F '/' '{ print $NF }' | sed 's/\.sql$/+/g')

You can save this in your .bashrc with a function, eg:

function dbt_run_changed() {
    children=$1
    models=$(git diff --name-only | grep '\.sql$' | awk -F '/' '{ print $NF }' | sed "s/\.sql$/${children}/g" | tr '\n' ' ')
    echo "Running models: ${models}"
    dbt run --models $models
}

This function takes an optional argument, +, that will also run the children of the changed models!

Usage:

$ dbt_run_changed
$ dbt_run_changed +

claire · March 27, 2019, 6:02pm

Here’s my hot tip!

I’m a huge fan of keeping my directories organized in a tree structure. Since I do client work, my tree structure ends up looking like:

.
└── fishtown
    ├── client
    │   ├── stark-industries
    │   │   ├── stark-industries-dbt
    │   │   └── stark-industries-lookml
    │   └── wayne-enterprises
    │       └── wayne-enterprises-dbt
    ├── dbt
    └── packages
        ├── segment
        └── utils

It can be annoying to cd into the right folder, so I use the goto utility to set up aliases for my folders.
Then I can do things like:

$ goto stark
$ pwd
./fishtown/client/stark-industries/stark-industries-dbt

I also recently set up a bash alias to open GitHub for my current directory (not tested on GitLab! Sorry @emilie) – instructions here.

claire · March 28, 2019, 2:12pm

Oh and another one…

Sometimes if my dbt run is resulting in an error, I like to cycle my logs/dbt.log file before trying to run it again, to make debugging easier.

I have a pretty rudimentary bash function that does this for me:

function cycle_logs() {
  # Rename the current log with a timestamp so the next run starts a fresh file
  suffix=$(date '+%Y-%m-%dT%H:%M:%S')
  mv -v logs/dbt.log "logs/dbt.log.${suffix}"
}

emilie · April 1, 2019, 10:40pm

I just set up goto! It’s great! +1 this tip!

joshtemple · April 3, 2019, 4:54pm

Drew, do you think this approach could be used in a CI framework to build/test/deploy only the models changed by the commits being merged into master?

I’m imagining diffing the feature branch against master and only running those models (with @ and +) in production. Could speed up build/test/deploy times a lot by leaving out the models that aren’t affected.

drew · April 3, 2019, 5:08pm

Cool idea! This would definitely be a good starting point, but it won’t hold up for changes to macros or dbt_project.yml, for instance. dbt doesn’t currently provide a rock-solid way of understanding which models are impacted by changes to a given set of files, but that’s certainly something it could do some day!
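As a starting point only, here is a minimal sketch of the diff-against-master idea. It assumes the main branch is named master, that models live under a directory called models, and that each model is named after its file. The names paths_to_selectors and dbt_run_changed_vs_master are hypothetical. It does not detect models affected by changes to macros or dbt_project.yml.

```shell
# Hypothetical helper: turn file paths on stdin into space-separated selectors.
# Assumption: models live under a directory named "models".
paths_to_selectors() {
    local suffix=${1:-}
    grep -E '(^|/)models/.*\.sql$' \
        | awk -F '/' '{ print $NF }' \
        | sed "s/\.sql\$/${suffix}/" \
        | tr '\n' ' '
}

# Hypothetical wrapper: run the models changed on this branch since it left master.
dbt_run_changed_vs_master() {
    local models
    # ACMR = added, copied, modified, renamed (deleted files are skipped)
    models=$(git diff --name-only --diff-filter=ACMR master...HEAD | paths_to_selectors "${1:-}")
    if [ -z "$models" ]; then
        echo "No changed models found"
        return 0
    fi
    echo "Running models: ${models}"
    dbt run --models $models         # unquoted on purpose: one argument per model
}
```

Passing + as the first argument appends it to every model name, the same way dbt_run_changed does.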

ryantuck · July 2, 2020, 5:53pm

Here’s a basic run-and-test function I’ve got in my ~/.bash_profile:

dbtrt() {
    # "$*" joins the arguments into one string for display
    echo "Running and testing: $*"
    # only test if the run succeeded
    dbt run -m "$@" && dbt test -m "$@"
}

thoren · July 2, 2020, 6:02pm

Here’s another bash function:

# find and open in emacs
find_and_open() {
    find . -type f -iname "*$1*" | xargs emacsclient -nw -a ''
}

This can be configured for other editors too. E.g., for Sublime, just change the ending to ... sublime -w
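A minimal sketch of an editor-agnostic variant, in case it helps. It assumes $EDITOR holds a single command name with no extra arguments (falling back to vi), and the name find_and_edit is hypothetical. It uses find's -exec rather than xargs, so file names containing spaces are passed through intact.

```shell
# Hypothetical variant: open every file whose name contains $1 in $EDITOR.
# Assumption: $EDITOR is a bare command name (no flags); defaults to vi.
find_and_edit() {
    find . -type f -iname "*$1*" -exec "${EDITOR:-vi}" {} +
}
```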

anton.sol · May 8, 2023, 12:37pm

That’s neat! I noticed that this does not run newly added models, so I modified the code to include all modified and newly created (but not yet committed) models.

function dbt_run_changed() {
    children=$1
    models=$(git status --short | grep -E '^( M|\?\?).*\.sql$' | awk '{print $2}' | xargs -n 1 basename | sed "s/\.sql$/${children}/g"| tr '\n' ' ')
    echo "Running models: ${models}"
    dbt run --models $models
}

This function takes an optional argument, +, that will also run the children of the changed models!
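Two caveats worth checking: with default settings, git status --short collapses a brand-new untracked directory into a single entry, and the pattern above skips files that are already staged. Below is a hedged sketch that asks git to list every untracked file and also accepts staged changes. The names status_to_selectors and dbt_run_changed_all are hypothetical, and it still assumes model names match file names and that paths contain no characters git would quote.

```shell
# Hypothetical helper: turn `git status --short` lines on stdin into selectors.
# Keeps untracked (??), modified and added entries; drops deletions.
status_to_selectors() {
    local suffix=${1:-}
    grep -E '^(\?\?|[ MA][ M]) .*\.sql$' \
        | cut -c4- \
        | awk -F '/' '{ print $NF }' \
        | sed "s/\.sql\$/${suffix}/" \
        | tr '\n' ' '
}

# Hypothetical wrapper around the helper above.
dbt_run_changed_all() {
    local models
    # --untracked-files=all lists files inside new directories one by one
    models=$(git status --short --untracked-files=all | status_to_selectors "${1:-}")
    if [ -z "$models" ]; then
        echo "No changed models found"
        return 0
    fi
    echo "Running models: ${models}"
    dbt run --models $models         # unquoted on purpose: one argument per model
}
```

Using awk instead of xargs basename also avoids an error from basename when nothing has changed on systems where xargs runs the command even with empty input.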
