State of testing in dbt

Hi all!

As a former software engineer, I came across something called “design by contract”. The idea, proposed by Bertrand Meyer (another French citizen, sorry ;-)) and implemented in the Eiffel language, derives from Hoare triples, and I wonder how it could be used to validate a model. I’d like to take a little time to explain myself, and I invite you to challenge the idea heartily!

Folks in the Eiffel world use this every day, and it heavily reduces the need for unit tests, which can only show the presence of errors. (In fact, there is even a way to automatically generate unit tests from the notion below.)

The idea of testing for correctness, instead of looking for defects, could be translated into our dbt world like this.
Given:

  • a model M, the implementation of a data transformation referencing zero or more direct upstream models as input
  • a list of tests (data tests, schema tests on the direct upstream models), known as M’s precondition
  • a list of tests (data tests, schema tests on M), known as M’s postcondition

M’s correctness can be expressed like this:

Any run of the model started in a state where its precondition holds will end in a state where its postcondition holds. Something like: dbt test_precondition -m M; dbt run -m M; dbt test_postcondition -m M
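
For what it’s worth, one can roughly approximate this sequence with today’s dbt CLI. A sketch only, assuming the 1+ graph selector (available since dbt 0.18) and tests already declared on M and its direct parents:

    # precondition: run the tests declared on M's direct parents
    dbt test -m 1+M --exclude M
    # the transformation itself
    dbt run -m M
    # postcondition: run the tests declared on M
    dbt test -m M

(Data tests would still need to be selected separately, e.g. by tag, so this only covers what is declared in schema.yml.)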

It’s quite intuitive: think of a transformation where a numeric column of an upstream model must be non-null and positive, for example because M computes its square root.
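
A minimal sketch of that precondition with today’s schema tests, assuming a hypothetical upstream model named payments with a numeric amount column (accepted_range comes from the dbt-utils package):

    # schema.yml of the upstream model -- M's precondition, in today's vocabulary
    version: 2
    models:
      - name: payments              # hypothetical direct parent of M
        columns:
          - name: amount            # M computes sqrt(amount)
            tests:
              - not_null
              - dbt_utils.accepted_range:   # from the dbt-utils package
                  min_value: 0
                  inclusive: false          # strictly positive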

Then, to prove the correctness of all or part of your transformation chain, you only have to run it with preconditions and/or postconditions enabled on all or a subset of the models. If dbt finishes successfully, the chain is correct.

What do you think of this idea?

With these glasses on, one can already see schema tests as being:

  • part of the postcondition of the model (as they constrain the underlying transformation)
  • part of the precondition of all direct downstream models (as downstream transformations may rely on M’s schema tests).

So one could envision:

New model properties:

  • require: precondition tests, much like tests at the model level, but with the ability to also list data tests
  • ensure (could remain tests): postcondition tests, same idea, but at the model or column level.
    In fact, should a direct upstream model have no schema tests, one could even add column-level tests in M’s require block for the upstream columns it needs (see the sketch below).
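
To make the proposal concrete, here is a purely hypothetical schema.yml sketch; neither require nor ensure exists in dbt today, and the model and test names reuse the square-root example from above:

    # schema.yml -- hypothetical syntax, nothing below exists in dbt today
    version: 2
    models:
      - name: M
        require:                          # precondition
          - data_test: positive_amounts   # a data test, listed by name
          - model: payments               # column-level tests on a direct parent
            columns:
              - name: amount
                tests:
                  - not_null
        ensure:                           # postcondition (today's tests)
          columns:
            - name: sqrt_amount
              tests:
                - not_null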

New model configuration:

  • assertion: an array of possible 'require' and 'ensure' values, used to configure the level of correctness checking for that model (will the precondition and/or postcondition tests be executed?).
    Thinking out loud, this could also be useful in Jinja code, to assert that some condition holds at a given point in a macro. One could envision a macro like check(boolean). In that case, one could add an extra 'check' value to the list, used to enable assertions in Jinja (a sketch follows below).
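
Here is what such a check macro could look like. A sketch only: since the assertion config does not exist, I gate it behind a plain var; var and exceptions.raise_compiler_error, on the other hand, are real dbt Jinja context functions:

    {# macros/check.sql -- hypothetical assertion helper #}
    {% macro check(condition, message="check failed") %}
      {# gated behind a var while no 'assertion' config exists #}
      {% if var('enable_checks', false) and not condition %}
        {{ exceptions.raise_compiler_error(message) }}
      {% endif %}
    {% endmacro %}

One would call it as {{ check(some_flag, 'flag should hold here') }} inside a macro, and enable it with dbt run --vars '{enable_checks: true}'.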

So yes, it all boils down to a kind of intertwined dbt run + test at the model level!
And to follow up on @jerco’s comment, these tests are more than tests; they are contracts, first-class citizens of a model’s metadata: “if you run me under these preconditions, I will ensure these postconditions for you”.

Of course, the new config above could be used to disable the tests in production.
To cope with eventual consistency, this mechanism should allow deferring relationships tests to the end of the dbt run.
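
Disabling in production could be as simple as keying the hypothetical assertion config off the target; target.name is real, the assertion argument is not:

    -- models/M.sql (sketch; the 'assertion' config is hypothetical)
    {{ config(
        assertion=[] if target.name == 'prod' else ['require', 'ensure']
    ) }}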

Do you think it could be valuable?

In fact, this notion is so powerful that it could even be used to synchronize parallel executions of parts of the DAG (Eiffelists call that SCOOP).

Imagine you partition the DAG, assigning a processor/thread to each partition: model B and model A sit in two distinct partitions, assigned to distinct execution threads. B’s precondition contains a data test T referencing A. If T fails, dbt should not error out, but instead wait until T holds, and then run model B and the downstream models in B’s partition.

So dbt run could have an “infinite” behaviour mode: each processor runs an endless loop over its DAG partition, applying any wait conditions before each model execution. Data tests used in a model’s require config that reference only models from the same partition would behave as standard require tests.
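
Purely illustrative, with everything below hypothetical: B declares its partition and a cross-partition precondition, which the “infinite” mode would turn into a wait rather than a failure:

    # schema.yml -- hypothetical syntax
    models:
      - name: B
        config:
          partition: 2            # B is owned by processor/thread 2
        require:
          - data_test: T          # T reads from model A, owned by partition 1;
                                  # if T fails, partition 2 waits and retries
                                  # instead of erroring out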

We could even add source freshness to the list of waiting conditions.

I am not a graph theory specialist, and I have more experience in imperative than functional programming, so there are edge cases for sure…
And it’s easier said than done, for sure…