Hi,
Just wondering if things changed in the last year when it comes to tests and best practices. I would like to add simple, or more complex tests, to a model and configure it so that the model doesn’t get created if it doesn’t pass the test?
Hi,
Just wondering if things changed in the last year when it comes to tests and best practices. I would like to add simple, or more complex tests, to a model and configure it so that the model doesn’t get created if it doesn’t pass the test?
Oooh… I love this question, @moseleyi ! I’d also appreciate knowing if best practices have changed over time.
There are several ways I can think of.
Option 1:
You can create a dummy dependency on the model, then put the test on that dependency. If the test fails, nothing downstream would run. Mention that dependency in a SQL comment.
-- this is a sql comment. {{ref('my_thing')}} will still compile and show in the dag. But it won't get executed in the database
{# this is a jinja comment. nothing would compile #}
The only shortcoming of this approach is that it only works when you build the upstream stuff as well. If you only build the one target model, the test would not build.
Approach 2
use the dbt run_query
macro (About run_query macro | dbt Developer Hub) at the top of your model to test some result. Load the result to a variable and evaluate pass/fail. Then do something like this:
{% if execute %}
{% if my_test_has_a_failed_condition %}
{{ exceptions.raise_compiler_error("The test failed.") }}
{% endif %}
{% endif %}
Basically dbt will stop running at that point.
There are a few other approaches I can think of, but option 1 is the most “dbtoncic”… so I would start there.
Ahh yes, I guess the approach #1 is basically to create an intermediate layer on which we run tests to stop the models used by BI/people
Yes. I assumed this model had no existing upstream dependencies, which is why I suggested a dummy upstream model. But since you are talking about models consumed by BI, certainly there are upstream models where the test can be applied, right.
And don’t know why I didn’t think of this before, but you could just put the custom test on the source if it is the first model in the chain. So really there is no scenario where you need a dummy model with the comment approach. Just regular old dbt testing. Test the thing that is upstream, source or model.
I’m creating a new reply so this thread has a super clear solution for future folks who stumble on it.
The correct way to do this in dbt is to put the test on the upstream object, source or model.
If you need it to work when the model is run in isolation and not just when the entire DAG is run, then a more complicated approach needs to be taken.
But best practice would say solve at the simplest and most standard way.
Please mark this is the solution if it helped you.
dbt is adding a feature called “unit tests”, which solves your question in a new way. Unit Tests will run before the creation of a model, which is what you are looking for.
It should be released in dbt 1.8, around April 2024 timeframe.
Cheers!
Interesting… and definitely worth the wait!