Release: dbt v0.16.0

dbt v0.16.0 - Barbara Gittings

:bell: Who is Barbara Gittings? Check out the release notes for a biography of this famous Philadelphian :bell:

dbt v0.16.0 overhauls dbt’s compilation contexts to make compilation more consistent, improves performance, and provides a whole bunch of highly requested functionality and helpful bugfixes :tada:.

There are some breaking changes to be aware of in this release. Note: most projects will not be impacted by these changes, but please read them carefully in case any apply to your usage of dbt!

Breaking changes

  • Quirks with type inference for seed CSV files have been fixed. The fix may subtly change the data that dbt seed loads for your project.
  • BigQuery range bucket partitioning must now be configured with the new-style partitioning config
  • Support for the one-argument variant of generate_schema_name has been dropped
  • Files with a .yml extension found in the data/, macros/, analysis/, tests/, and snapshots/ directories will now be parsed as schema.yml specifications
  • The accepted arguments of the get_catalog macro have changed
  • The signature of the snowflake__list_schemas macro has changed
  • dbt no longer supports building models in Snowflake databases with more than 10,000 schemas
  • Arguments to source schema tests were previously parsed inconsistently; they are now parsed in the same way as arguments to model schema tests
  • The timestamp present in debug log lines is now rendered in a more standard format
  • The docrefs key has been removed from the manifest.json file

For a full list of changes in this release, please consult the release notes.


Installation notes

# With Homebrew
brew install dbt@0.16.0
brew link --overwrite dbt@0.16.0

# Or with pip
pip install --upgrade dbt==0.16.0

Selected highlights from the changelog follow:

A compilation context for dbt_project.yml :computer:

The models:, snapshots:, and seeds: configs in the dbt_project.yml file are now evaluated using a “base” compilation context. This means that you can reference variables, env vars, the selected target, and other context values when configuring resources in your project. Here’s a quick example to give you an idea of what’s possible:

name: my_project
version: 1.0.0

# Configure models in the `models/marts` directory to build
# as tables in prod, or views in dev/CI/etc

models:
  my_project:
    marts:
      materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"

For more information on the dbt_project.yml compilation context, check out the docs.

Document everything :open_book:

Documentation can now be provided for:

  • analyses
  • custom data tests
  • macros
  • seeds
  • snapshots

These resources can be configured in schema.yml files in all of the places you would expect: macros/, data/, snapshots/, analysis/, and tests/ directories. Check out the docs on the schema.yml syntax for more information on documenting these resources, as well as usage information for some new documentation-oriented configs.

Some quick highlights:

  • Resources can be hidden from the rendered documentation site using the docs config
  • Metadata can be provided for models using the meta config
  • Columns and column tests can be configured with tags using the tags config. These tags can be used to select specific tests to include or exclude using --models and --exclude selectors
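
As a rough illustration of the three configs above, here is a minimal schema.yml sketch. The file path, model names, column name, tag names, and the `owner` key are all placeholders chosen for this example, not anything prescribed by dbt; consult the schema.yml docs for the authoritative syntax.

```yaml
# models/schema.yml (hypothetical path; all names below are placeholders)
version: 2

models:
  - name: customers
    description: One record per customer
    meta:
      owner: "@alice"            # arbitrary key/value metadata
    columns:
      - name: email
        tags: ['pii']            # tag applied to the column
        tests:
          - unique:
              tags: ['nightly']  # tag applied to this test only

  - name: customers_scratch
    docs:
      show: false                # hide this model from the docs site
```

With a file like this, something along the lines of `dbt test --models tag:nightly` should select only the tagged test, and `--exclude tag:nightly` should skip it.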

BigQuery incremental model improvements :racing_car: :tornado:

This one was a team effort – major shout outs are in order for everyone who contributed in the issue and the Pull Request (seriously - if you want to see open source development in action, check these out!).

dbt v0.16.0 ships with the ability to configure the incremental_strategy for BigQuery incremental models. Check out @jerco’s posts on using the incremental strategy config and benchmarking incremental performance for more information on how to use this powerful new feature.
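
For orientation, a minimal sketch of what this could look like when set from dbt_project.yml. It assumes a project named `my_project`, a hypothetical `models/events` directory, and a hypothetical `event_date` column; the posts linked above remain the better guide to choosing a strategy.

```yaml
# dbt_project.yml (fragment; directory and column names are placeholders)
models:
  my_project:
    events:
      materialized: incremental
      # 'merge' is the default strategy; 'insert_overwrite' replaces whole partitions
      incremental_strategy: insert_overwrite
      # new-style partitioning config: a dictionary rather than a string
      partition_by:
        field: event_date
        data_type: date
```

The same settings can also be supplied in a model's own config() block if only a single model needs them.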

Generating database names

With the addition of the generate_database_name macro, the triumvirate of generate_*_name macros is now complete. In addition to dynamically generating the names of model aliases and schemas, the database that models are rendered into can now be configured with a macro. Check out the example below, which renders models into a single database in dev and CI, but spreads models across different databases in prod:

{% macro generate_database_name(custom_database_name=none, node=none) -%}

    {%- set default_database = target.database -%}
    {%- if custom_database_name is none or target.name != 'prod' -%}

        {{ default_database }}

    {%- else -%}

        {{ custom_database_name | trim }}

    {%- endif -%}

{%- endmacro %}

Use it with:

-- models/my_model.sql

{{ config(database='marketing') }}

select *
from ....
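
The custom database can also be set for a whole directory instead of model by model. A minimal sketch, assuming a project named `my_project` with a hypothetical `models/marketing` folder:

```yaml
# dbt_project.yml (fragment; the folder name is a placeholder)
models:
  my_project:
    marketing:
      # passed to generate_database_name as custom_database_name
      database: marketing
```

With the macro above in place, these models would build into the `marketing` database only when the target is named `prod`, and into the target's default database everywhere else.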

Performance improvements :fast_forward:

The following actions should feel noticeably faster, with performance lifts varying by database:

  • Time to start running models (most noticeable on Snowflake and BigQuery)
  • Time to generate docs with dbt docs generate

These speed improvements are a function of 1) using smarter queries to fetch data from the information schema and 2) parallelizing queries to the information schema. Future releases of dbt will expand on the approach implemented in this release.

Thanks to these contributors! :muscle:

If you’re interested in working on a feature in the dbt backlog, check out the Contributing Guide and drop us a line on Slack! Thanks to the following contributors who submitted PRs for the 0.16.0 release :tada:
