[Prerelease] dbt Core v1.0.0-b1

A few weeks ago, I promised that dbt Core v1.0 was on the horizon.

On October 11, we released dbt-core v1.0.0-b1. We’re planning to put out several more betas and a release candidate over the next six weeks, with plans for a final release in early December.

This post will be a combination of fine print and feature preview. I’ll also be updating it many times over the next six weeks. For now, I welcome you to:

  • Try out the beta
  • Read prerelease docs
  • Join #dbt-v1-readiness in dbt Slack, and get your project ready by upgrading now to the latest stable version (v0.21.0)
  • Grab your spot at Coalesce, when we’ll be cutting the v1.0 ribbon :slight_smile:

Installation

We’re taking the opportunity to rework our release processes. Right now, v1 beta releases of dbt-core, dbt-postgres, dbt-redshift, dbt-snowflake, and dbt-bigquery are available on PyPI. To install your specific adapter, including dbt-core and all dependencies:

pip install dbt-<adapter> --pre --upgrade

Coming soon:

  • dbt-spark v1.0.0b1
  • Homebrew formulae + Docker images for all of the above
  • Availability in dbt Cloud

Renaming

We’ve just renamed a repository, from dbt to dbt-core, and we’ve updated the logo in its README. Why?

Five years ago, the name dbt referred to a pretty particular thing: a handy command-line tool that made it much easier to create views in Postgres, by storing their definitions in version control and promising to run them in the right order.

Today, dbt refers to a lot more things: a community of practice, a commercial software product, a fast-growing company, a burgeoning package ecosystem, a way of writing SQL, a way of viewing and thinking about analytics problems.

To that end, we’re going to start saying “dbt Core” when we mean dbt-core—that is, the foundational open source software at the heart of it all. The goal here is clarity, and also pride. This is a huge milestone for dbt Core, and we’re going to feel its ripple effects all over the place, across the wide smorgasbord that is dbt in 2021.

Plugins

Our open source plugins for Redshift, Snowflake, and BigQuery now live in their own repositories: dbt-redshift, dbt-snowflake, and dbt-bigquery. Those are the places you should go to report bugs, suggest features, and contribute code that’s specific to each. In fact, we believe this change will make contributing easier than ever! If you depend on or care deeply about one or more, I welcome you to star and watch those repositories.

The dbt-postgres adapter plugin will continue to cohabitate with dbt-core, in the dbt-core repo. There’s a practical reason for this: We use Postgres pretty extensively for core testing and local development workflows today. Eventually, we plan to move it into a separate repo, too. For the time being, it will remain a bit of an exception—though not, I hope, a big source of confusion.

After v1.0, dbt-core will not make breaking changes to adapter interfaces in patch releases. As such, Labs-supported adapter plugins will start declaring compatibility dependencies (~=) on minor versions of dbt-core, and we invite all other database adapters to do the same. This makes it much easier to release and use new patch versions, as soon as we have fixes ready. We’ll still coordinate around new features and interface changes (if any) for all new minor versions.
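To make the compatibility operator concrete, here is a rough sketch of what a pin like dbt-core~=1.0.0 accepts. It is a simplified illustration of the "compatible release" rule, not how pip or any adapter actually evaluates it: the function names are made up for this example, and it only handles plain numeric versions, so pre-release tags such as 1.0.0b1 are out of scope.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Split a plain numeric version such as '1.0.3' into (1, 0, 3)."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible_release(spec_version: str, candidate: str) -> bool:
    """Simplified check for `~=spec_version` (numeric releases only).

    `~=1.0.0` is treated as: at least 1.0.0, and still within 1.0.*.
    """
    base = parse_release(spec_version)
    if len(base) < 2:
        raise ValueError("~= needs at least two release segments")
    found = parse_release(candidate)

    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    width = max(len(base), len(found))
    base_padded = base + (0,) * (width - len(base))
    found_padded = found + (0,) * (width - len(found))

    # Every segment except the last one in the spec must match exactly.
    prefix = base[:-1]
    return found_padded[: len(prefix)] == prefix and found_padded >= base_padded


# A hypothetical adapter pinned to dbt-core~=1.0.0:
print(satisfies_compatible_release("1.0.0", "1.0.3"))  # True: new patch release
print(satisfies_compatible_release("1.0.0", "1.1.0"))  # False: new minor version
```

Under this reading, an adapter picks up every dbt-core patch release automatically, while a new minor version requires the adapter to publish a new release of its own.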

The code for the dbt RPC server also lives in its own repository: dbt-rpc. The RPC server started as an experiment in possibilities of interactive dbt development, and it’s proven the value of that proposition, serving as the beating heart behind the dbt Cloud IDE. At the same time, we’re convinced that we need to build a more robust, scalable dbt Server. Stay tuned for more details. In the meantime, we’re going to keep maintaining dbt-rpc functionality, but we won’t be including it in dbt Core v1.

Housekeeping

While we were in the renaming spirit, we also updated the dbt-core default branch from develop to main. There was no really good reason to go against established convention here, just old habits. If you have a local clone of the repo, you’ll just need to make a quick update before your next contribution :wink:

Last but not least: We’re adding a stale bot to the dbt-core repo. It will automatically tag any issues that have had no updates for 180 days, and close them (if still no updates) one week later. Our intention here is not to ignore any issue as soon as it’s old, but rather to make the repository a more manageable and accessible place for everyone. Many of the most compelling ideas have been around for a while, and we reserve the right to re-open them at any time.

Notable changes (so far!)

v1 is more than just a reorg—it’s a new version of dbt Core, gosh darn it! There are a handful of features already in b1, with more to come over the next several weeks.

Performance

In v0.20, back in July, we introduced a top-to-bottom rework of partial parsing, and a brand-new static parser for many models. In v1.0, we’re turning on both, for everyone, by default.

When all is said and done, compared to v0.19.0 (released in January), we believe dbt Core v1.0.0 will offer a 100x faster development experience in very large projects—that is, a 100x speed-up when reading files, identifying changes, updating an internal manifest, and kicking off queries.

We hope v1.0 feels speedy right out of the gate. Thank you for all of your help, patience, and feedback this year as we made performance a top priority.

Global configs

Previously, some runtime configs could be set via flags, some via env vars, and even some in profiles.yml. What gives?

All global configs can now be set in one of three ways: the config block in profiles.yml, an environment variable named DBT_<GLOBAL_CONFIG>, or a CLI flag named --<global-config>. That list runs from lowest to highest precedence: a CLI flag overrides an env var, which overrides user config.
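Here is a minimal sketch of that lookup order. It is not dbt's actual implementation: the function name, its arguments, and the use of fail_fast as the example config are assumptions made for illustration, and values are returned as-is with no type coercion.

```python
from typing import Any, Mapping, Optional


def resolve_global_config(
    name: str,
    cli_flags: Mapping[str, Any],
    environ: Mapping[str, str],
    user_config: Mapping[str, Any],
    default: Optional[Any] = None,
) -> Any:
    """Return the value for `name`, checking the highest-precedence source first."""
    # 1. CLI flag, e.g. --fail-fast, assumed here to be parsed into {"fail_fast": True}
    if cli_flags.get(name) is not None:
        return cli_flags[name]

    # 2. Environment variable following the DBT_<GLOBAL_CONFIG> pattern
    env_key = "DBT_" + name.upper()
    if env_key in environ:
        return environ[env_key]

    # 3. The config block from profiles.yml, assumed already loaded into a dict
    if name in user_config:
        return user_config[name]

    return default


# The env var beats profiles.yml, and the CLI flag beats both.
user_config = {"fail_fast": False}
environ = {"DBT_FAIL_FAST": "true"}
print(resolve_global_config("fail_fast", {}, environ, user_config))
print(resolve_global_config("fail_fast", {"fail_fast": False}, environ, user_config))
```

The first call falls through to the environment variable because no flag was passed; the second stops at the CLI flag.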

Even more renaming

Tests have been renamed, once and for all:

  • schema tests are now generic tests
  • data tests are now singular tests

That’s really it. Tests are more alike than they are different; ultimately, the two test types are just two points of entry into the same functionality. It’s all up to you and your use case.

We also renamed a handful of behind-the-scenes configs in dbt_project.yml. Many of these renames were long overdue:

  • source-paths is now model-paths. It’s the place you create models.
  • data-paths (default data/) is now seed-paths (default seeds/). It’s the place you create seeds.
  • modules-path (default dbt_modules) is now packages-install-path (default dbt_packages). It’s the place you install packages.

These aren’t breaking changes—we’ve got backwards compatibility for the old names—you’ll just see a deprecation warning or two right after upgrading. This is a one-minute switch, set, & forget. Most important: all new users, starting with v1.0+, will never need to know the difference.
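If you would rather check a project for the old names up front, the renames listed above can be sketched as a small helper. This is an illustration only, not dbt's own deprecation logic: the function name is hypothetical, and it assumes dbt_project.yml has already been loaded into a plain dict.

```python
from typing import Any, Dict, List, Tuple

# Old name -> new name, taken from the renames listed above.
RENAMED_KEYS = {
    "source-paths": "model-paths",
    "data-paths": "seed-paths",
    "modules-path": "packages-install-path",
}


def rename_project_keys(project: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return a copy of `project` using the new key names, plus the old names found."""
    # Start from every key that is not one of the deprecated names.
    updated = {key: value for key, value in project.items() if key not in RENAMED_KEYS}
    deprecated_found = []
    for old_key, new_key in RENAMED_KEYS.items():
        if old_key in project:
            deprecated_found.append(old_key)
            # If the new name is already set, keep it rather than overwrite it.
            updated.setdefault(new_key, project[old_key])
    return updated, deprecated_found


project = {"name": "my_project", "source-paths": ["models"], "data-paths": ["data"]}
updated, deprecated_found = rename_project_keys(project)
print(updated)
print(deprecated_found)
```

Here deprecated_found lists the old names still present in the project, which is roughly the set you would expect to see deprecation warnings for.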
