@boxysean thank you so much for pulling together this excellent literature review. You’ve really nailed the capabilities and constraints of testing with dbt at the end of 2020, as they are and (even more importantly) as they’re understood to be.
I’m excited to watch two talks about testing + dbt over the next two days; and since I’m already overdue in responding to this, I thought I’d leave a few thoughts here now.
## If you can write it in SQL, you can test it
As I see it today, dbt testing can be quite full-featured—the primitive tests can do a lot—so long as you’re willing to write the code for it. Data tests are a great tool for one-off assurances; you just need the SQL that selects the rows you’re sure shouldn’t be there. When those assurances need to become assertions, standardized and cross-applicable, rather than copy-pasting data tests and swapping out a line or two, it’s simple enough to DRY up that code, parametrize the inputs (`model`, `column_name`), and wrap it in a Jinja macro prefixed `test__`. Write a bit more Jinja, and you can end up somewhere very fancy indeed:
- unit testing of one model’s logic via in-file fixtures, following Sean’s and Josh’s examples above
- mock data in seeds, supplied by an external library like faker, that stand in place of your real source data via a `source()` override, and enable full end-to-end DAG testing
- leveraging inferential statistics to identify outliers or aberrations, via BigQuery ML + dbt_ml, or by building on a unified multilingual platform like Databricks
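To make that DRYing-up concrete, here is a minimal sketch of a parametrized schema test macro, assuming the pre-v1.0 convention in which a schema test macro is defined with a `test` prefix and returns a query counting failing rows (the macro name and the assertion itself are illustrative, not taken from the post above):

```sql
-- macros/test_positive_values.sql
-- Illustrative custom schema test: counts rows where the given
-- column is negative. A nonzero count means the test fails.
{% macro test_positive_values(model, column_name) %}

    select count(*)
    from {{ model }}
    where {{ column_name }} < 0

{% endmacro %}
```

Once defined, it can be attached to any column in `schema.yml`, right alongside `unique` and `not_null`.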
All of which is to say, the fact that dbt only ships with four schema tests never struck us as a limitation: those four are the starting points, and the code is simple enough that users can write many, many more. There are a dozen custom schema tests in dbt-utils. @claus wrote several dozen more in dbt-expectations, replicating many of the great ones in Great Expectations.
In this way and many others, dbt is a framework—a way of writing code, wrapped in boilerplate—more than it is a library. The same holds true for materializations, strategies, default macros, and the rest: the fact that a user can override, adapt, and extend built-in code to accomplish their unique use case is one of dbt’s biggest features.
## Better built-ins
All of that said, I think there are a number of obvious, uncontroversial ways to make tests more accessible and powerful out-of-the-box. By virtue of their age, there are also a few things about tests that don’t make a lot of sense, and we should fix that. We want to make those changes soon, ahead of dbt v1.0.
Here are some of the things on my wishlist:
- Data tests should be configurable and documentable.
- Schema tests and data tests should both return a set of failing rows, rather than today’s split of a numeric `count(*)` for the former and a set of rows for the latter. Why? Because…
- dbt should make it easier to debug errant tests in development, by (optionally) writing the failing records to database tables.
- Users should have the option of setting warn and error severity via thresholds (absolute or a relative %), similar to source freshness.
- When a user creates a schema test, they should feel they are building a reusable asset, not just hacking together one more macro to maintain. Schema test blocks (!) should have optional configurations like description and human-friendly failure messages. Those assets are even more valuable when packaged up and open sourced.
- At the same time, you shouldn’t need a custom schema test to do something obvious, like add a `where` filter or a `limit`. All schema tests should have the ability to apply those configs, right out of the box.
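To make the wishlist concrete, here is a hypothetical sketch of what configurable built-in tests could look like in `schema.yml`. The keys shown (`where`, `limit`, `warn_if`, `error_if`) are illustrative of the ideas above, not a finalized spec:

```yaml
# Hypothetical syntax: the config keys below sketch the wishlist,
# they are not a committed dbt interface.
version: 2

models:
  - name: orders
    columns:
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']
              where: "ordered_at >= current_date - 7"  # filter, out of the box
              limit: 500            # cap how many failing rows get stored
              severity: warn
              warn_if: ">10"        # threshold-based severity,
              error_if: ">100"      # similar to source freshness
```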
(Check the v0.20.0 milestone for detailed issues on most of the above.)
## Lowering barriers
Will this solve everything? No, definitely not—it doesn’t even begin to touch on some of the coolest things Sean noted above. I agree first and foremost, however, that there is too high a barrier separating `unique` + `not_null` from more advanced usage, and not enough scaffolding along the way.
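For reference, that baseline is only a few lines of YAML in a model’s `schema.yml` (model and column names here are illustrative):

```yaml
version: 2

models:
  - name: customers
    columns:
      - name: customer_id
        tests:        # the 2020-era key for attaching tests to a column
          - unique
          - not_null
```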
I’d love to end up in a place where we add native support for mock data + unit testing, following an opinionated + thoughtful approach that reflects a community-driven consensus. If we’re going to get there, we need to focus on solidifying the simple stuff today: making the case that dbt testing is fully featured in its own right, and fully extensible in yours.
I want dbt developers to feel comfortable investing their time and energy in pushing the limits of tests. I know we have work to do here; I’m reminded of it every time I hear someone suggest that dbt’s built-in schema tests are fine as baby steps, on the way toward using a separate dedicated test framework. (Other frameworks, including GE, are great! If it’s not already clear, they inspire a lot of my thinking here.) I hope that the technical changes I listed above are a reasonable set of steps on the way there.
At the same time, I so agree with Sean’s suggestion that a big, big piece of this is less about technical change and more about a clearer narrative in the documentation:
- We should talk about schema test definitions more like models, snapshots, and seeds—things you expect to create—rather than just as things you expect to use, and quickly grow out of.
- We should clarify the building blocks for folks who want to build more complex systems. If you’re writing unit tests, should you make your equality comparisons using custom code? The widely used `dbt_utils.equality()` schema test? The beloved `audit_helper.compare_relations()` macro? Why not the `adapter.get_rows_different_sql()` adapter method, which is used in our adapter-testing suite?
- We should encourage more community-contributed tooling that reinforces a culture of testing, tightly integrated with models, such as @tnightengale’s new package that offers a way to enforce test + documentation coverage.
- We should tell a story about tests being expressive attributes of models, ones that can validate, document, and communicate their meaning. They are (dare I say it) rich pieces of metadata.
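As one example of those building blocks, an equality comparison between a model and a seeded fixture can be expressed with the `dbt_utils.equality` schema test; the model and seed names here are hypothetical:

```yaml
version: 2

models:
  - name: fct_orders
    tests:
      # Compare this model's output to a seeded fixture of expected rows.
      - dbt_utils.equality:
          compare_model: ref('expected_fct_orders')
```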
Wherever possible, we should have solid recommendations, use cases, and trade-offs. The most intrepid dbt developers have already done promising work here. I want to see many more community members taking measured risks, pulling on some threads, and reporting back with their ideas and findings—without feeling like they’ve gone off the deep end. I promise, I’m right there with you.