Updates
- [Sep 27] v0.21.0 (final) is available for production use.
- [Sep 27] v0.21.0-rc2 is available. It includes small bug fixes and bumps to schema versions for changed metadata artifacts.
- [Sep 20] v0.21.0-rc1 is available for prerelease testing.
Who is Louis Kahn? Check out the release notes for a biography of this famous Philadelphian.
dbt Core v0.21 (Louis Kahn) is now available on PyPI, Homebrew, DockerHub, and dbt Cloud.
Our two big areas of focus for this release were a new build task and reconciling configs and properties. I’ll say more about both below. That said, there are a bunch of other very cool features, including:
- on_schema_change for incremental models: goodbye manual post-merge migrations, goodbye ad hoc full refreshes
- state:modified: upstream macro changes! sub-selectors!
- dbt deps keeping you in the know about new package versions
There’s much more where that came from. I’d encourage you to read:
- Migration guide for an overview of new and changed documentation
- Release notes and changelog for the full set of features, fixes, and under-the-hood tweaks
Installation
# with pip, install a specific adapter
pip install dbt-<adapter>==0.21.0rc2
# with Homebrew, install the four original adapters
brew install dbt@0.21.0-rc2
Heads up: We will be changing some installation details for the next version of dbt (the one after v0.21). Going forward, we will no longer be supporting pip install dbt. Please ensure that you’ve switched any production pipelines to pip install dbt-<adapter>.
Breaking changes
Note that this release includes breaking changes for:
- Freshness checking: The CLI command has been renamed to dbt source freshness, and its selection syntax now works like other tasks.
  - Backwards compatible: The old name (source snapshot-freshness) was lengthy and easy to confuse with snapshot. The old command name will continue working, but it will no longer be documented.
  - NOT backwards compatible: The previous selection syntax allowed you to select specific sources by name without the source: prefix, which is how standard selection syntax works. If your deployments select specific sources to freshness check, you must add the source: prefix.
- Snowflake: Remove most transactional logic and turn on autocommit by default. We believe this should significantly reduce Cloud Services credit consumption in standard dbt operations.
- Artifacts (see schemas.getdbt.com):
  - manifest.json has a new v3 schema that includes additional node properties (no changes to existing properties)
  - run_results.json has a new v3 schema that includes skipped as a potential TestResult
  - sources.json has a new v2 schema that adds timing and thread details.
One notable (non-breaking!) change
- All dbt tasks now use --select instead of --models to select resources. Tasks that previously used --models (run, test, compile, docs generate, list) have preserved the old behavior for backwards compatibility.
New task: build
https://next.docs.getdbt.com/reference/commands/build
What does dbt build do? Well, everything: it runs your models, tests your tests, snapshots your snapshots, and seeds your seeds. It does this, resource by resource, from left to right across your DAG.
In DAG order: it’s worth repeating! If you previously struggled to deploy a dbt project that mixes models and snapshots throughout, this is the task for you.
Tests on an upstream model will block downstream models from running. If any test fails, the downstream models will be skipped. Why? The answer won’t surprise you: We think test failures matter—enough to stop a DAG for.
Geoffrey: You fool! As if it matters how one test fails.
Richard: When the failure’s all that’s left, it matters.
If there are tests in your project that aren’t worth stopping for, that’s totally ok; that’s just what test severity is good for. You can configure those tests with error_if thresholds (“only stop if you find >100 failures”), or to warn always and keep dbt a-buildin’.
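Here’s a rough sketch of what those two options could look like in a properties file. The model, column, and file names are hypothetical, made up for illustration; severity, error_if, and warn_if are the test configs described above.

```yaml
# models/orders.yml (hypothetical file and model names)
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        tests:
          # No config: default severity is error, so any failure
          # causes downstream models to be skipped during `dbt build`.
          - unique
          # Only error (and stop) if more than 100 rows fail;
          # fewer failures are reported as a warning.
          - not_null:
              config:
                error_if: ">100"
      - name: customer_id
        tests:
          # Always warn, never stop the build.
          - relationships:
              to: ref('customers')
              field: customer_id
              config:
                severity: warn
```

With a setup like this, only the first two tests can block downstream models; the relationships test reports problems without stopping anything.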
How will you build?
Consider that:
- In development, dbt build --select model_a will both run and test model_a. (We reworked test selection in v0.20 to avoid surprises, by making sure this syntax doesn’t include tests with other unselected, unbuilt parents.)
- In CI, your build-on-PR job could be as simple as dbt build -s state:modified+ (plus --state, --defer, and a production manifest).
- In production, your regularly scheduled job could be dbt build, plus steps to check source freshness and generate documentation.
dbt build works with all the powerful selection syntax you’ve come to know and love, including yaml selectors, a potent, version-controlled way to define subsets of your DAG. Also new in v0.21: the ability to define default yaml selectors, thereby offering custom control over the “full build” experience (i.e. dbt build without --select or --exclude).
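As a minimal sketch of a default yaml selector, assuming a project that marks retired models with a hypothetical deprecated tag (the selector name and tag are invented for illustration):

```yaml
# selectors.yml (selector name and tag are hypothetical)
selectors:
  - name: everything_but_deprecated
    description: "Full build, minus anything tagged 'deprecated'"
    # New in v0.21: used whenever a task is run without
    # --select, --exclude, or --selector.
    default: true
    definition:
      union:
        # Start from every resource in the project...
        - method: fqn
          value: "*"
        # ...then drop the ones carrying the tag.
        - exclude:
            - method: tag
              value: deprecated
```

With something like this in place, a bare dbt build would skip the tagged resources, while an explicit --select would still take precedence over the default.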
dbt build is an opinionated task. It’s the culmination of all we’ve built (running models with resilient materializations, prioritizing data quality with tests, updating fixtures with seeds, capturing slowly changing dimensions with snapshots), all for one DAG, and one DAG for all.
We think you should use build, but you don’t have to: all the existing tasks are still there to mix and match.
Configs and properties
https://next.docs.getdbt.com/reference/configs-and-properties
Previously, we had entire sections of the dbt documentation dedicated to explaining the difference between resource configs and resource properties. It can be hard to remember which is which, and the distinctions are not minor: they’re defined in different places, and configs carry a lot of additional functionality.
Student: Why is database a config for models (settable in Jinja config() and dbt_project.yml), but a property for sources (settable only in yaml files that aren’t dbt_project.yml)?
Teacher: Well, you see, configs tell dbt how to do something, whereas properties tell dbt about what something is. dbt creates models, so model location is a how; dbt knows about sources, so source location is a what.
Student: Ok - what about persist_docs, which is a config? That uses description, which is a property???
Teacher: Well, you see, one property of configurability is that it’s contagious, like a child’s laughter, and so the fact of persist_docs being a config raises description, as it were, from being a measly property into a config-plus-one, through an alchemical process that our greatest researchers are only beginning to understand…
Okay, okay. So, what’s the change? You can now set resource configs in all yaml files, using a new config property. Using that property, you can set configurations, just as you can with the in-file config() macro or in dbt_project.yml. This is our initial stab at reconciling two different ways to apply significant attributes to models, seeds, snapshots, and tests.
Examples
The big change here is a conceptual one. There are also some specific changes you can make in your projects right now:
Configure column types for one seed
Previously, you could only do this in dbt_project.yml, with fairly wonky syntax. Now, you can:
# seeds/my_seed.yml
version: 2
seeds:
  - name: my_seed
    config:
      column_types: {my_date_field: date}
Set meta as a project config, then override it
# dbt_project.yml
models:
  +meta:
    owner: data_team
    important: true
Override it for one model in its yaml properties, or right in its .sql file:
# models/my_specific_model.yml
version: 2
models:
  - name: my_specific_model
    config:
      meta:
        owner: me          # overrides
        contains_pii: yes  # net-new
        # inherits `important` from project-level config
-- models/my_specific_model.sql
{{ config(meta = {'owner': 'me', 'contains_pii': 'yes'}) }}
select ...
Note that this change is backwards compatible, so existing meta definitions will keep working. If you want to start using config inheritance, you’ll need to move meta from a top-level key to a key nested under a config block.
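To make that move concrete, here’s a small before-and-after sketch. Both model names are hypothetical, and the two entries sit side by side only for comparison:

```yaml
# models/schema.yml (hypothetical file and model names)
version: 2

models:
  # Before: meta as a top-level property.
  # Keeps working, but does not take part in config inheritance.
  - name: model_with_property_meta
    meta:
      owner: me

  # After: meta nested under config.
  # Combines with any +meta set in dbt_project.yml.
  - name: model_with_config_meta
    config:
      meta:
        owner: me
```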
Limitations
This was a big first step; the work is never done. Some properties are not yet configs, and so lack those capabilities:
- Properties of sources + exposures. I’d love to support setting database in a sources key in dbt_project.yml. That’s still not possible, unfortunately, but we’ve taken a big step in the direction of making it so.
- Special properties, such as description, tests, columns. These have different rendering contexts, or are responsible for creating new nodes (!). These would be much trickier to implement, and we’ll need to revisit in the future.
There are going to be wrinkles and limitations that we’ll discover, and iteratively improve, over time. It’s a first big attempt at reconciliation ahead of locking in dbt-core interfaces later this year. Let us know what you think.
What’s ahead
The next minor version of dbt Core, after v0.21, will not be v0.22 — it will be v1.0. That means:
- Specific changes to the ways you install dbt Core + adapter plugins
- More consistent, intuitive ways to use and interface with dbt-core
- Clarity about which pieces of dbt-core are “locked in,” and which things can change in minor versions post-v1.0
Excited? Questions? Stay tuned: there’s more coming soon, and to a Coalesce near you.