[Jan 14] v0.19.0-rc2 is available. It includes a fix and a few under-the-hood changes on top of RC1.
[Jan 05] v0.19.0-rc1 is available for prerelease testing.
Below, I’ll give an overview of the biggest changes since v0.18. I’d also encourage you to read:
- Changelog for the full set of features, fixes, and under-the-hood tweaks
- Prerelease docs for an overview of new and changed documentation
Give this RC a spin, and let us know what you find by responding below and posting in the #prereleases channel. Barring any show-stopping bugs, we expect to release the final version in two weeks’ time.
    # with pip
    pip install --upgrade dbt==0.19.0rc1

    # with homebrew
    brew install dbt@0.19.0-rc1
    brew link --overwrite dbt@0.19.0-rc1
Gently breaking changes
We don’t expect these to require action in most projects.
We’ve made changes to all JSON artifacts that dbt produces, starting with the addition of a metadata dictionary. For the first time, we are version-controlling and documenting them in detail. A full JSONSchema of each versioned artifact will always be available at schemas.getdbt.com. Check out each new schema there.
Older JSONSchemas of the four artifacts (as of v0.18.1) are hosted at the same site as v0. Note that these are not official versions, but they may be helpful if you need to migrate existing code.
Why do this now? Artifacts have become increasingly important: they power new dbt features (such as Slim CI), and enable integrations with the wider data ecosystem. By establishing these contracts now, we want you to feel confident that wraparound workflows will not break at a moment’s notice. So, go for it: calculate documentation coverage from the manifest, identify bottlenecking models within the run results, track table size via the catalog. If you can parse JSON, you can do it.
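As a rough illustration of the kind of wraparound workflow this enables, here is a minimal Python sketch that computes documentation coverage from a manifest and ranks nodes by run time from the run results. It rests on assumptions you should check against the published schemas: that each entry under `nodes` in the manifest carries `resource_type` and `description`, and that each entry under `results` in the run results carries `unique_id` and `execution_time`. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_artifact(path: str) -> dict:
    """Read one JSON artifact (for example target/manifest.json) from disk."""
    return json.loads(Path(path).read_text())


def documentation_coverage(manifest: dict) -> float:
    """Fraction of models in the manifest that have a non-empty description."""
    # Assumption: manifest["nodes"] maps unique IDs to node dictionaries.
    models = [
        node for node in manifest.get("nodes", {}).values()
        if node.get("resource_type") == "model"
    ]
    if not models:
        return 0.0
    documented = [m for m in models if (m.get("description") or "").strip()]
    return len(documented) / len(models)


def slowest_nodes(run_results: dict, top_n: int = 5) -> list:
    """Return (unique_id, execution_time) pairs, slowest first."""
    # Assumption: run_results["results"] is a list of per-node result dicts.
    timings = [
        (result["unique_id"], result.get("execution_time", 0.0))
        for result in run_results.get("results", [])
    ]
    return sorted(timings, key=lambda pair: pair[1], reverse=True)[:top_n]
```

After a run, you would call `documentation_coverage(load_artifact("target/manifest.json"))` and `slowest_nodes(load_artifact("target/run_results.json"))`, assuming the default `target/` output directory.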
Update from v0.18: Slim CI (docs)
The introduction of two beta features in dbt v0.18.0, state:modified and --defer, enabled a powerful new workflow in CI: build only the models that have changed since the last prod run (state:modified), and save time by selecting from their unmodified parents in prod (--defer).
dbt v0.19.0 includes two substantive changes that make Slim CI even better:
state:modified: dbt now stores the unrendered version of Jinja expressions used to set configs in dbt_project.yml. If you have expressions that return different results based on the target, dbt previously marked those as modifications. Now, it’s a little bit smarter at detecting what’s a real change as the result of development.
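To see why comparing unrendered expressions helps, consider this toy Python sketch. It is not dbt's implementation: `render` is a hypothetical stand-in for Jinja rendering, and the config expression is invented for illustration.

```python
def render(expression: str, target_name: str) -> str:
    """Hypothetical stand-in for Jinja rendering: substitutes the target name."""
    return expression.replace("{{ target.name }}", target_name)


# The same unrendered config expression, identical in the production
# state and in the CI checkout (nobody edited it).
prod_unrendered = "analytics_{{ target.name }}"
ci_unrendered = "analytics_{{ target.name }}"

# Comparing rendered values flags a difference caused only by the target.
rendered_differs = render(prod_unrendered, "prod") != render(ci_unrendered, "ci")

# Comparing unrendered expressions correctly reports no modification.
unrendered_differs = prod_unrendered != ci_unrendered

print(rendered_differs, unrendered_differs)  # prints: True False
```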
Subtle tweak to --defer. Previously, this worked as a binary: either you were running a model, or you were referencing it from the state manifest. This was simple as an initial implementation, with its fair share of edge cases. We’ve dialed back deferral to work instead as a “fallback” mechanism: If you need to select from a model, and it doesn’t exist in your schema, dbt will instead look for it in the other manifest’s namespace. If it does exist in your schema, great! No need to defer.
This subtle change fixes edge cases around seeds and model re-runs. It also enables us to support deferral for tests, too, which should lighten the burden on complex node selection logic in CI job definitions.
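The fallback rule can be modeled in a few lines of Python. This is a simplified sketch, not dbt's code: the function, its parameters, and the schema names are all hypothetical, and it assumes that a model selected in the current run is always built in your own schema.

```python
from typing import Set


def resolve_reference(
    model: str,
    selected: Set[str],
    exists_in_current_schema: Set[str],
    current_schema: str,
    deferred_schema: str,
) -> str:
    """Return the schema-qualified name that a reference to `model` resolves to."""
    if model in selected or model in exists_in_current_schema:
        # Built in this run, or already present in your schema: no need to defer.
        return f"{current_schema}.{model}"
    # Fallback: use the location recorded in the other (state) manifest.
    return f"{deferred_schema}.{model}"


# stg_orders is neither selected nor present in the scratch schema, so it defers.
print(resolve_reference("stg_orders", {"orders"}, set(), "dbt_ci", "analytics"))
# prints: analytics.stg_orders
```

Under the v0.18 behavior described above, the second case (a model that exists in your schema but was not selected) would still have been deferred.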
What’s the drawback? Previously, you could use --defer as a way to reliably run a downstream model in your dev or CI schema while always selecting from production references. Now, if those references do exist in your scratch schema, dbt will use them instead. You can simply drop them (or drop and recreate your schema) to replicate the original behavior.
All in all, Slim CI is more powerful, better documented, and more intuitive. For now, it is still a preview feature in dbt Cloud—contact support if you’re interested.
- After being deprecated in v0.17.0, config-version: 1 specifications of dbt_project.yml are no longer supported. See the v0.17.0 migration guide for details.
Notable non-breaking changes
Snapshots now offer first-class support for capturing hard-deleted records via an optional config, invalidate_hard_deletes. If a unique key disappears from the snapshot query, the snapshot will update dbt_valid_to; if it reappears, the snapshot will add a new record.
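The behavior can be pictured with a toy Python model of a snapshot table. This is only an illustration of the rule described above, not how dbt implements it: the function name and the `id` key column are hypothetical, and ordinary change detection on existing rows is left out.

```python
from datetime import datetime
from typing import Dict, List, Set


def apply_hard_deletes(
    snapshot: List[Dict[str, object]],
    source_keys: Set[str],
    now: datetime,
) -> List[Dict[str, object]]:
    """Toy model of invalidate_hard_deletes applied to a snapshot table."""
    result = [dict(row) for row in snapshot]  # copy rows; leave the input untouched
    # Keys that had an open (still valid) record before this run.
    open_keys = {row["id"] for row in result if row["dbt_valid_to"] is None}

    # A key that vanished from the source query: close its open record.
    for row in result:
        if row["dbt_valid_to"] is None and row["id"] not in source_keys:
            row["dbt_valid_to"] = now

    # A key present in the source without an open record: add a new record.
    for key in sorted(source_keys - open_keys):
        result.append({"id": key, "dbt_valid_from": now, "dbt_valid_to": None})
    return result
```

Running it once with a key missing from `source_keys` closes that key's record; running it again after the key returns appends a fresh open record, leaving the closed one as history.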
YAML selectors now support a description attribute, and they appear in manifest.json. (Support in the dbt-docs DAG viz coming soon.)
The re Python module is now available to Jinja templating code (within macros, models, wherever), enabling much more complex regex logic. Of course, if you need regular expressions for data transformation, use SQL!
Some BigQuery-specific additions:
- Partitioning tables by hour, month, or year via a
- New token-based connection methods support OAuth in dbt Cloud and other deployments.
- OAuth connections (via gcloud) will use your default configured project, instead of raising an error, if none is specified in profiles.yml. This gives dbt-bigquery the distinction of having the most concise profile possible:
    my-bigquery:
      outputs:
        dev:
          type: bigquery
          method: oauth
          dataset: dev_jerco
      target: dev
- get_columns_in_relation performs better with external tables than in v0.18.1, thanks to a one-line fix
- Postgres model names can be ≤51 characters long (up from 34) without fear of silent truncation
- You can use doc blocks inside of exposure descriptions
Under the hood
- Updated dependencies for Google and Snowflake libraries
- Unofficial support for Python 3.9. dbt-core and most plugins can run in py39 environments. (dbt-snowflake cannot.) We’ll declare official support in a future release, once all plugin dependencies are compatible.
We know that dbt takes too long to parse big projects today. The “dead time” between typing dbt run and seeing the first model execute is especially painful because there’s no way to get around it: you experience it whether you’ve selected to run one model or a thousand.
The v0.19.0 release introduces a new command, dbt parse, that will parse your project and produce a file with detailed timing info (target/perf_info.json). We’re planning to follow up soon with a v0.19 performance release: changes that will, we believe, reduce parse time by half in large projects.
That’s just a starting point. We’ll be devoting significant time and energy in 2021 to rewriting the slowest parts of dbt from the ground up. In the long run, we want all projects to parse in seconds, not minutes. If you’re interested in early access to alpha and beta versions of performance releases, send me a message—we’d love to have your help.
The next minor version will be all about tests. Check out:
- Initial milestone in GitHub
- Recent Discourse thread about current capabilities and constraints for testing in dbt
In the process, we’re hoping to resolve some inconsistencies that should get us well on our way to v1.0 later this year. Happy 2021!