Release: dbt v0.17.0

:bell: Who is Octavius Catto? Check out the release notes for a biography of this famous Philadelphian :bell:

dbt v0.17.0 introduces a new, more stable version of dbt_project.yml, with positive implications for variable scoping and Jinja-YAML rendering. This release also includes some much-requested features and bugfixes. Note that all projects will be impacted by these changes.

Breaking Changes

  • dbt_project.yml has a new format, reflected by config-version: 2. All version-1 configs emit a deprecation warning, and support will be removed in a future dbt version.
  • The list_relations_without_caching, drop_schema, and create_schema macros and methods now accept a single argument of a Relation object with no identifier field.
  • The graph object available in some dbt contexts has changed. Sources have been removed from nodes and stored in a new member object sources.
  • The ‘location’ field has been removed from BigQuery catalogs. This should avoid permission issues encountered by many users of dbt on BigQuery when generating their project documentation site.

For the full set of changes implemented since 0.16, please consult the release notes and migration guide.


Installation notes

# With Homebrew
brew install dbt@0.17.0
brew link --overwrite dbt@0.17.0

# Or with pip
pip install --upgrade dbt==0.17.0

Some selected highlights from the changelog:

New dbt_project.yml

The goals of our new dbt_project.yml are threefold:

  • Avoid ambiguity between configs and folder path names. Previously, if you had a model subdirectory named materialized, some surprising things would happen. Now, we differentiate between a subdirectory named materialized and a model config +materialized.
  • Move vars scope to the project level. Variables must have the same default value for all models within a package. At the same time, it is now possible to use more data types for variables in installed packages.
  • Make every resource type configurable. You’re likely used to configuring models in dbt_project.yml within a models: block. The same has been possible for seeds: and snapshots:; it is now also possible for sources:.
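
As a rough illustration of the first and third goals, here is a sketch of a version-2 dbt_project.yml. The models/materialized/ subdirectory and the source name stripe are hypothetical, and the +enabled setting assumes that sources accept an enabled config; check the migration guide for the configs actually supported.

```yaml
# dbt_project.yml (sketch; names below are hypothetical)
name: my_project
version: 1.0.0

config-version: 2

models:
  my_project:
    materialized:            # a subdirectory: models/materialized/
      +materialized: view    # a config: the "+" prefix marks it as such

sources:
  my_project:
    stripe:                  # hypothetical source name
      +enabled: false        # assumption: sources accept an enabled config
```

The "+" prefix is what lets dbt tell the folder key and the config key apart, even when they share a name.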

When upgrading, you’ll need to add one line to dbt_project.yml in existing projects:

name: my_project
version: 1.0.0

config-version: 2    # this line is new!

vars:                # this goes here now
  beginning_of_modernity: '1910-12-01'
  my_installed_package:
    include_bonus_models: true

models:
  my_project:
    marts:
      +materialized: table
      +tags: ["important", "daily"]
    staging:
      +schema: stg

For more information on the dbt_project.yml compilation context, check out the migration guide.

Faster Snowflake metadata queries

At the beginning of runs, dbt caches information about objects already in the database. Previously, the select * from information_schema queries used to list objects on Snowflake felt painfully slow. There were two reasons:

  • Any select query, even from the information_schema, requires an active warehouse and will queue behind other active-warehouse queries.
  • The previous query included an ilike to handle inconsistencies around casing and quoting. This required scanning much more data than strictly necessary, and the runtime of this query seemed to scale with the number of objects in the logical database.

We’ve now replaced all information_schema queries with show and describe statements that do not require an active warehouse to run and perform equality lookups only. We hope your Snowflake runs feel snappier!

Persisting docs

On all core plugins, you can now persist model and column descriptions as comments in the database.

To persist all descriptions for all models:

models:
  my_project:
    +persist_docs:
      relation: true
      columns: true

To persist only column descriptions for one model:

{{ config(
    persist_docs = {'columns': true}
) }}
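
The descriptions that get persisted are the ones you already write in your schema .yml files. A minimal sketch, assuming a model named stg_stripe_charges with hypothetical columns id and amount:

```yaml
# models/schema.yml (sketch; model and column names are hypothetical)
version: 2

models:
  - name: stg_stripe_charges
    description: "One record per Stripe charge"   # persisted when relation: true
    columns:
      - name: id
        description: "Primary key of the charge"  # persisted when columns: true
      - name: amount
        description: "Charge amount in cents"
```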

Persisting model descriptions has been available for BigQuery users since 0.14.0. Since then, a number of community members have expressed their appreciation for this feature.

In the future, we may add support for persisting comments on schemas/datasets and databases/projects as well.

Fail faster

When you’re running a series of models or tests, you can supply the new --fail-fast flag (for example, dbt run --fail-fast) to stop the run on the very first error or failure. The run then exits with a message like:

Encountered an error:
FailFast Error in model stg_stripe_charges (models/stg_stripe_charges.sql)
  Failing early due to test failure or runtime error

This can be handy if you’re:

  • Making changes to several models at once in development. Spot syntax errors quickly without waiting for an entire build.
  • Building out a process for blue-green deployment. Fail quickly and roll back gracefully!

Behind the scenes

  • Native rendering for Jinja variables. This works hand-in-hand with the changes to vars scope to give users greater power when configuring installed packages.
  • Snapshots are now unified and atomic operations, executed in Snowflake and BigQuery with a single merge statement.
  • Ever-improving support for community-supported plugins:
    • dbt --version includes the versions of installed plugins in its output.
    • Schema tests alias column names to support dbt-sqlserver.
    • dbt prints statuses using the adapter plugin’s idea of how to display relations. This supports adapters (e.g. Spark) that use namespaces other than database.schema.identifier.

Thanks to our contributors!

We had 18 individual contributors for this release—the greatest number yet! If you’re interested in working on a feature in the dbt backlog, read through the Contributing Guide, check out the latest good first issues, and drop us a line on Slack.

Thanks to the following contributors who submitted PRs for the 0.17.0 release:

These are just stellar release notes. :clap: We’re a little behind on upgrading and using dev versions to help contribute, but I’m working on getting us up to speed in this area.

Thanks to all contributors, and especially @jerco for that bio of Octavius Catto. :heart: