Did you know? dbt ships with its own project!

Have you ever executed dbt run in a pretty small project and been confused by the number of macros in the output?

$ dbt run
Running with dbt=0.15.0
Found 1 model, 0 tests, 0 snapshots, 0 analyses, 138 macros, 0 operations, 0 seed files, 0 sources

138 macros? Huh?

Well, this is the result of a pretty cool design pattern: dbt ships with its own global project, and then overlays your own project on top of it, resulting in the 138 macros! They’re not in your project; they’re in the global project!

If you dive into the source code for dbt, you’ll find a directory called global_project (here). In it, you’ll find a file called dbt_project.yml, as well as folders for docs and macros. Looks familiar, right?

This “global” project includes lots of the snippets of Jinja that get used in every project, including:

  • Materializations, plus the macros that contain the relevant SQL to power them (here)
  • Schema tests (which we just define as a macro that is prefixed with test_), e.g. test_unique (here).
  • The macro that determines the name of the schema that a model should be built in, generate_schema_name (here), and the table/view name, generate_alias_name (here).
  • The default text for the overview page of your docs site, used when you run dbt docs generate and defined in a docs block called overview (here).

Take a look around, and you’ll start to see some familiar things!

If you dive deeper, you’ll discover there’s also some magic around default macros and adapter macros, as well as separate projects for each adapter (e.g. plugins for Postgres, Redshift, BigQuery and Snowflake), but we’ll leave that for another article.

Cool, cool, cool… so why am I telling you this?

Well, there’s a pretty nifty design pattern in dbt: if your dbt project has a macro with the same name as one of the global macros, dbt will favor the macro defined in your project over the global implementation. You see this crop up in a few places in the docs:

  • To define your own logic for schema names, add a generate_schema_name macro to your project (docs)
  • To define your own logic for relation names, add a generate_alias_name macro to your project (docs)
  • To write your own overview for your docs, add an overview docs block to your project (docs)
  • To override the default implementation of the uniqueness test (for example, to only test in defined environments), add a test_unique macro to your project (Discourse example). Pro-tip: start by copying the code from the default implementation, and edit from there.
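
The override rule above can be pictured as a layered lookup. Here is a toy Python model of that idea; it is only a sketch, not dbt's actual implementation. The macro names come from this post, but the dictionaries, the macro "bodies", cents_to_dollars, and the resolve_macro helper are all hypothetical stand-ins.

```python
from collections import ChainMap

# Hypothetical stand-ins for macros shipped in dbt's global project.
GLOBAL_PROJECT = {
    "generate_schema_name": "global: target schema plus custom suffix",
    "generate_alias_name": "global: model name or custom alias",
    "test_unique": "global: count duplicated values",
}

# Hypothetical macros defined in your own project.
MY_PROJECT = {
    "generate_schema_name": "mine: custom schema only in prod",
    "cents_to_dollars": "mine: divide by 100",
}

# ChainMap searches its maps left to right, so the project layer
# wins whenever both layers define a macro with the same name.
MACROS = ChainMap(MY_PROJECT, GLOBAL_PROJECT)


def resolve_macro(name: str) -> str:
    """Return the macro body that this toy model would use for `name`."""
    return MACROS[name]


if __name__ == "__main__":
    print(resolve_macro("generate_schema_name"))  # the project version
    print(resolve_macro("test_unique"))           # falls back to the global version
    print(len(MACROS))                            # distinct macro names across both layers
```

The last line mirrors the macro count in the dbt run output: the total covers every distinct name across both layers, even when your own project contributes only a handful.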

Broadly, I think it’s pretty cool to just understand a little more of how dbt works under the hood, so I wanted to share this explanation.

Another great example of leveraging this design pattern is when you want to make changes to an existing materialization. For example, Warby Parker wanted to adapt the create_table_as macro on Postgres to be able to use the UNLOGGED parameter (Postgres docs). They were first able to test out their changes by adding a postgres__create_table_as macro to their own project to override the default implementation. Once they got it working as expected, they were able to contribute it back to dbt!
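
To make the adapter-prefixed naming a bit more concrete, here is a simplified Python sketch of how a lookup like this could work. It assumes that an adapter-specific name (postgres__create_table_as) is tried before the default__ name, and that your project is searched before the global project. The layer dictionaries, the macro bodies, the some_other_adapter placeholder, and the resolve_adapter_macro helper are hypothetical, and dbt's real resolution logic has more moving parts.

```python
from typing import Dict, List

# Hypothetical layers, searched in order: your project first, then the global project.
MY_PROJECT: Dict[str, str] = {
    "postgres__create_table_as": "project: create unlogged table ...",
}
GLOBAL_PROJECT: Dict[str, str] = {
    "default__create_table_as": "global default: create table ...",
}
LAYERS: List[Dict[str, str]] = [MY_PROJECT, GLOBAL_PROJECT]


def resolve_adapter_macro(name: str, adapter: str) -> str:
    """Prefer `<adapter>__<name>`, then `default__<name>`.

    For each candidate name, the project layer is checked before the global one.
    """
    for candidate in (f"{adapter}__{name}", f"default__{name}"):
        for layer in LAYERS:
            if candidate in layer:
                return layer[candidate]
    raise KeyError(f"no implementation of {name!r} for adapter {adapter!r}")


if __name__ == "__main__":
    # On Postgres, the project-level override is picked up.
    print(resolve_adapter_macro("create_table_as", "postgres"))
    # An adapter without its own version in this toy model gets the default.
    print(resolve_adapter_macro("create_table_as", "some_other_adapter"))
```

In this model, deleting the entry from MY_PROJECT sends Postgres back to the default implementation, which is what makes a project-level override a low-risk way to try out a change before contributing it upstream.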
