Setting up VSCode to use with the dbt CLI

:information_source: For the most-up-to-date version, you might want to go here: Using VSCode with dbt | dbt-sqlserver-docs

Intro

When our team first started using the dbt CLI, we started with Claire’s well-loved discourse post, How we set up our computers for working on dbt project. The post details how the dbt team uses Atom and iTerm 2 on macOS for an improved workflow. Many folks commented on how they acheived similar productivity using VSCode. I thought I’d consolidate some of this into a single article, and expand on it given the recent developments. I’m also going to add things to make it easier for working with Azure databases such as the Azure CLI and Azure Data Studio.

Goals

Following this guide will enable the following behavior in VSCode (some points lifted from Claire’s guide – linked above)

  • a stable, reproducible Python environment for running dbt
  • syntax highlighting for SQL files with jinja in them
  • graying out the text of files that have compiled code in them, to help prevent you from editing compiled SQL (as opposed to your actual model)
  • quick switching b/w a model file and it’s compiled and run counterparts
  • allow

Prerequisite

If you’ve never used VSCode with Python, I strongly recommend at least the first half of Dan Taylor’s Get Productive with Python in Visual Studio Code talks. It covers a lot of the basics like installing Python, the Python extension, and the command pallette.

You should also have the following installed:

  • Git
  • VSCode
  • Python 3 (via anaconda, brew or python.org )

In VSCode you’ll also need to install the Python extension

If you already know VSCode

Here’s a gist for an example .vscode directory that contains a settings.json and an extensions.json

Getting started

To get started, we’ll use the jaffle_shop repo, a self-contained project.

You can use the Git CLI or the VSCode Git extension to Git Clone command in VSCode

git clone https://github.com/dbt-labs/jaffle_shop.git

Then, open the jaffle_shop/ directory in VSCode.

Python environment

Goal

The goal of this section is to ensure that the right version of Python and dbt are always available right away when you open your dbt project in VSCode. Sounds simple, but below is a one-time setup guide on how to make it work. More context is that some folks have bundled this set up process into bash scripts and Docker containers.

Some folks deem this problem so difficult as to justify having users use Docker containers, but I have yet to be convinced of that yet.

Walkthrough

Python can be tricky get working in VSCode (and trickier to work on Windows). You OS likely already has a version of python installed, but this can be troublesome because you don’t control it’s version.

It’s better practice to have a dedicated dbt environment. Three popular tools are venv’s, virtualenv’s and conda environments. Our team uses conda envs because we have many different projects with different sets of package requirements, but if dbt is 1) your only use case for Python, or 2) your first Python-based use case, you’ll likely have a better time with virtualenvs. I’m going to only talk about venv because it comes built-in with Python

Open a terminal with `CTRL+`` (which should open within the jaffle_shop directory) and do the following steps:

# make sure you have Python at least 3.6 and less than 3.10
# Create and activate virtual environment
python3 -m venv .dbtenv
source .dbtenv/bin/activate

# install the dbt package you want
pip install dbt-synapse # or dbt-sqlserver or whatever

# make Git ignore all these newly created files
echo '.dbtenv/' > .gitignore

Once you’ve done this you should now be able to:

  1. bring up the command pallette (CMD+SHIFT+P)
  2. search for “Python: Select Interpreter”, and
  3. Pick the .dbtenv environment (should be the first result)

Those three steps will:

  1. activate the Python extension if it hasn’t been already
  2. ensure that all new terminals opened in VSCode will auto-activate your .dbtenv environment

This is huge because now all your terminals in the VSCode will always have your dbt package available. However, this behavior will not persist the next time you open this repo in VSCode. To make this auto-env selection persist, you must do two things:

  1. add a requirements.txt to you the top level of the repo (pip docs on requirements.txt files)
  2. (optional) add to the requirements.txt what packages w/ versions you plan to do in this project (example below)
  3. create a new file .vscode/settings.json and add the Python path to the settings.json (more on VSCode settings later!)

requirements.txt

dbt-synapse==0.19.2
sqlfluff==0.7.1

.vscode/settings.json

{
    // change this to your desired path!
    "python.pythonPath": "./.dbtenv/bin/python",
}

Now that you’ve done these two things, everytime you open the jaffle_shop/ dir, in VSCode two things should happen:

  1. the Python extension activates right away (do you see the Python version listed alongside your environment name on the bottom info bar?)
  2. any terminal you open will auto-activate your .dbtenv and each line should begin with (.dbtenv)

You should test this by closing VSCode, then opening the jaffle_shop repo

Extensions

In this section, I’ll go over some of the extensions that our team uses. Each extension requires that you install it from within VSCode, and most will require adding additional settings to your .vscode/settings.json

vscode-dbt

the vscode-dbt extension is great because it provides a few things:

  1. syntax highlighting for SQL with jinja in it (also for .md’s and .ymls), and
  2. helpful jinja snippets will save you a lot of time

To get this working you should add the following to your .vscode/settings.json

There’s an optional addition I strongly recommend "**/target/**": "",, which will not do any syntax highlighting/colorization to any file in the target/ folder. This prevents me from making the classic mistake where I start editing a compiled model file, instead of the original model file. Then when I call dbt run my changes aren’t incorporated, but instead overwritten by the unchanged logic of the model file. with this setting you know something is wrong then the sql has no coloring.

"files.associations": {
    // the pattern on the left side can be whatever you want: e.g.
    "**/jaffle_shop/**/*.sql": "jinja-sql", // just the .sqlfiles inside of jaffle_shop, or
    "*.sql": "jinja-sql", // all .sql files

    // optional: don't format models in `target/` dir
    "**/target/**": "",
    // I don't personally use these, but you can also have jinja work for `yaml` and `md` files
    "**/<dbt-project-dir>/**/*.yaml": "jinja-yaml",
     "**/<dbt-project-dir>/**/*.yml": "jinja-yaml",
    "**/<dbt-project-dir>/**/docs/**/*.md": "jinja-md"

    // the vscode-dbt docs say you may need this
    "editor.quickSuggestions": {
    "strings": true
    }
}

You’ll know it is working when you open a .sql model and, in the bottom toolbar on the right it says now says “Jinja SQL” instead of “SQL”.

Find Related

the find-related extension allows you to use regular expressions to correspond a .sql file in your models/ directory to it’s compiled and run counterparts in the target/ folder. I find this a huge timesaver compared to manually naviagting the target/ dir in the explorer sidebar.

After you install the find-related extension, you can enable it by adding the following to your .vscode/settings.json. There’s no dbt or jinja magic going on here, just regex. So you may need to tweak these settings if they’re not working for you.

Once it is set up, you can type Option+R on any model file to jump to it’s compiled version. While on a compiled model file, Option+R will take you to it’s target/run counterpart.

{
    // this is so you can easily jump to your compiled SQL files
    "findrelated.workspaceRulesets": [
        {
            "name": "sql",
            "rules": [
                {
                    "pattern": "^(.*/)?models/(.*/)?(.+\\.sql)$",
                    "locators": [
                        "**/compiled/**/$3"
                    ]
                },
                {
                    "pattern": "^(.*/)?compiled/(.*/)?(.+\\.sql)$",
                    "locators": [
                        "**/run/**/$3"
                    ]
                },
                {
                    "pattern": "^(.*/)?run/(.*/)?(.+\\.sql)$",
                    "locators": [
                        "**/models/**/$3"
                    ]
                }
            ]
        }
    ],
    "findrelated.applyRulesets": [
        "sql"
    ]
    }

Rainbow CSV

the rainbow-csv extension just highlights csvs where each column is it’s own color. It’s great to use when you have a csv where character-width varies greatly within a column. You can also hover over a value to see what column it belongs to. Very helpful for seeds!

SQL Fluff

Our team has recently implemented sqlfluff linting for our dbt projects, especially because versions 0.6.5 and greater now support TSQL. There’s also a great VCcode extenstion.

If you already have a .sqlfluff and .sqlfluffignore configured and working, it is enough to install vscode-sqlfluff and add the following to your settings.json

    // you get this by calling `where sqlfluff` after calling `pip install sqlfluff`
    "sql.linter.executablePath": "<PATH_TO_YOUR_SQLFLUFF_FROM_WHICH_SQLFLUFF_COMMAND",
    "sql.linter.run": "onType" // alternatively "onSave" if you'd like it less frequent 

dbt Power User

I personally haven’t had time to use the dbt Power User extension, but folks have good things to say. I hope to try it out soon and folks are more than welcome make a Pull Request to this doc as they see fit.

Settings

Extra settings

Here’s some other settings that I recommend:

        // easier to see if there are unsaved changed
        "workbench.editor.highlightModifiedTabs": true,
        "workbench.editor.labelFormat": "medium",
        // make Command Prompt the default shell for Windows instead of Powershell
        "terminal.integrated.shell.windows": "C:\\Windows\\System32\\cmd.exe",
        
        // make a vertical line so I don't make lines too long
        "editor.rulers": [80],
        // show whitespace as dots
        // (easier to count out indentation and spot trailing whitesapce)
        "editor.renderWhitespace": "all",

Workspace-level settings files

Sometimes it isn’t convenient to have a .vscode/settings.json, such as when you:

  1. have a subset of settings under source control that you’d like all users to be using (it doesn’t make sense to source control your specific Python path)
  2. you prefer multi-root workspaces a.k.a. more than one repo open at at time (great for when you also want your .dbt/profiles.yml close at hand)

A worksapce settings file has the extension .code-workspace and encapsulates all the configuration you might find in a .vscode/ dir into a single file. This file also works as a shortcut that you can double click or navigate to to bring up all your settings.

If someone wants more info on this free free to open an issue. For now I’ll leave this as as stub.

12 Likes

This seems to be on the wrong nesting level. It gave en error till I moved it outside of "files.associations" block

1 Like

Search rules for related files seemed too limited for me:

  • if I am in models, why can’t I look for related files in run folder?
  • more often I am looking for .yml counterparty of the model rather than its compiled version => new locators and changed precedence
  • going back from .yml to the model also ultra-useful => new set of rules
  • I rarely do an error of editing compiled files, but I like them to be formatted as what they are - SQL with dialect specification
  • There are some more settings of the plugin at the end
        "findrelated.workspaceRulesets": [
            {
                "name": "sql",
                "rules": [
                    {
                        "pattern": "^((?!.*target).*/)?(analysis|data|macros|models|tests)/(.+/)?([^/]+)\\.sql$",
                        "locators": [
                            "**/$4.yml",
                            "**/compiled/**/$2/$3$4.sql",
                            "**/run/**/$2/$3$4.sql",
                        ]
                    },

                    {
                        "pattern": "^(.*/)?compiled/([^/]+)/(.+/)?([^/]+)\\.sql$",
                        "locators": [
                            "$3$4.sql",
                            "**/run/$2/$3$4.sql",
                            "**/$4.yml",
                        ]
                    },
                                        {
                        "pattern": "^(.*/)?run/([^/]+)/(.+/)?([^/]+)\\.sql$",
                        "locators": [
                            "$3$4.sql",
                            "**/compiled/$2/$3$4.sql",
                            "**/$4.yml",
                        ]
                    },
                ]
            },

            {
                "name": "yml",
                "rules": [
                    {
                        "pattern": "^(.*/)?(analysis|data|macros|models|tests)/(.+/)?([^/]+)?\\.yml$",
                        "locators": [
                            "$2/$4.sql",
                            "$2/**/$4.sql"
                        ]
                    }
                ]
            },
        ],
        "findrelated.applyRulesets": [
            "sql",
            "yml",
        ],
        "findrelated.autoOpen": false,
        "findrelated.autoPreview": false,
        "findrelated.openPreview": false,

NB: in (analysis|data|etc)list all subfolders of the project that can contain buildable SQL. E.g. “data” was common before dbt version 1.0.0 and would likely be substituted to “seed” in newer versions

1 Like

Thank you for the great article.

Making sqlfluff work

To make sqlfluff work I also had to do a few extra things in addition to what you described.

Sqlfluff has multiple ways deal with jinja-sql:

  1. A dbt based templater that uses dbt to compile your jinja-sql. This generates more accurate sql and automatically uses all your macros and variables, but is fairly slow.
  2. Their own jinja templater, which requires extra work to make dbt work, but is very fast

In addition to that, the vscode sqlfuff extension doesn’t work with the dbt templater.

But since the dbt templater is quite slow, it’s probably not the best choice for use inside of vscode anyways.

But it also means you have to use some workarounds to make your dbt variables, macros and packages work.

Configure sqlfluff

To configure sqlfluff I had to create two files:

.sqlfluff:

[sqlfluff]
# Run `sqlfluff dialects` to get a list of all dialects
dialect = tsql

# Use the dbt templater
# templater = dbt

# Use jinja templater instead of dbt
templater = jinja

[sqlfluff:templater:jinja]
# Make dbt default macros like available like `ref`
apply_dbt_builtins = true

# Load custom macros from the macro directory
load_macros_from_path = ./macros/

# Allow mocking dbt_utils and other packages through
# python files placed in this dir
library_path = ./sqlfluff_libs/

# Mock variables that are used in dbt
[sqlfluff:templater:jinja:context]
target = prod

.sqlfluffignore

target/
dbt_modules/
dbt_packages/
macros/

Make dbt_utils work

The above config worked, but when I ran sqlfluff against a model that used dbt_utils.surrogate_key(), I received this error:

L:   1 | P:   1 |  TMP | Unrecoverable failure in Jinja templating: 'dbt_utils'
                       | is undefined. Have you configured your variables?
                       | https://docs.sqlfluff.com/en/latest/configuration.html
L:  17 | P:  12 |  TMP | Undefined jinja template variable: 'dbt_utils'

I was able to fix it by creating a python library that mocks dbt_utils package macros:

sqlfluff_libs/dbt_utils.py:

def surrogate_key(field_list):
    return "'0xMockKey'"

To mock more complex and nested libraries, you can refer to:
https://docs.sqlfluff.com/en/stable/configuration.html#library-templating

A broken link…

I also noticed the link a gist for an example .vscode directory doesn’t seem to work, because you changed your github name.

The working link is: .vscode folder for getting started with dbt in VSCode · GitHub

2 Likes

Hi all,

Its been a while since this thread has been updated, and I just wanted to say that the best way (IMHO) to setup dbt to work with vscode has been to install the dbt-power-user extension along with some other extensions like sqlfluff, vscode-dbt, rainbow-csv etc.

[self promo warning] For those using Big Query, however, I’d like to present my vscode extension, vscode-dbt-bigquery-power-user which integrates an sql results preview pane

Its actually a fork of the dbt-power-user extension plus some other extensions (vscode-bigquery and vscode-query-runner which I integrated into this plus some of my own bag of tricks.

Some of the highlights of this extension are:

  • A Show Compiled SQL menu icon that opens the compiled sql version of your models

  • An Open Query Runner menu icon that runs the compiled sql and displays the results in another panel

  • The Query Runner panel that displays the results of the query.

    • The Rerun Query button which allows you to update the model in your original panel and rerun the query (including the compilation) as well as the Results navigation buttons to see the next/prev page of results.
    • The Download buttons which allow to display the results (with a choice of JSON, text, or csv formats) as well as download the data into the logs/results folder of the dbt-project.

Its still in alpha release, meaning it hasn’t been tested extensively but I’ve been dogfooding it in my own side projects, and I’m now looking for more feedback.

For those vscode users using dbt with bigquery , I hope you find it useful and give me some feedback.

2 Likes

I read the entire thing still clueless about how to set up VScode to use with dbt…

Can you provide a bit more context? Which parts specifically caused you difficulty?

A post was split to a new topic: Debugging run-operation

"python.pythonPath" setting is not supported anymore. Use "python.defaultInterpreterPath" :+1:

Clicking on “Show Model Graph” gives error: command 'showModelGraph' not found. Anyone else encountered this?

I installed dbt in a conda environment; I work on Windows and use git bash as a default terminal. dbt build and other buttons work correctly.

P.S. Thank you for the guide, was very helpful to get started.

UPD: according to the slack, it’s probably a bug in one of the new features; hopefully will be fixed soon

Yeah… I don’t find this very helpful. This is not for people who are new to VS Code and have no idea what you are talking about or how to go about executing what you are talking about. Some basic thoughts/questions I had while working through this (excuse my ignorance):

You are giving me code to implement in order to make things happen, but I’ve only ever worked in SQL and some Python, what is this code, where do I input it, where do I execute it?

I’m just going to stop there. I really just needed someone to explain this all to me as if I was a small child attempting to put on pants for the first time.

link does not work: “a gist for an example .vscode directory”
https://gist.github.com/swanderz/5cf876d88c7c8d268d8c1e1e5d05bffd
“a gist for an example .vscode directory”

How do you get VS Code to do some syntax highlighting in the Markdown files with doc blocks?

image

It is sooo difficult to work with a file like this without the text being highlighted differently from the doc block opening and closing statements.

I have tried installing Markdown All in One and Markdown Color Plus extensions, but they do nothing out of the box.

The solution is similar to

the sql file associations section in the settings.json

{
  "files.associations": {
    // the pattern on the left side can be whatever you want: e.g.
    "*.sql": "jinja-sql", // all .sql files
    "**/models/**/*.md": "jinja-md", // all .md files in the models folder will be rendered as jinjasql
    // optional: don't format models in `target/` dir
    "**/target*/**": ""
  },
1 Like