The problem I’m having
I’m working on a project that uses dbt on BigQuery. We have multiple models and have a bunch of schema tests for our models. Something that I noticed was that schema tests will try to pull in all the columns even though the test is specific for a column. I’m assuming this has to do with dbt storing all the associated columns with the rows that might fail the test.
The context of why I’m trying to do this
Some models might have a lot of columns and the associated cost with pulling in all columns to just test a single column is unnecessary and users should have the option to disable this. But I’m not sure if there exists a way to disable this
What I’ve already tried
Looked through documentation, found something around storing failures (you can set it as false and avoid storing failures) - [store_failures | dbt Developer Hub] but I’m assuming that even after this there is a select * for the schema test.
Some example code or error messages
A sample model and it’s schema test for sample_column
- name: sample_model
columns:
- name: sample_column
tests:
- not_null
The above schema test would compile down to the following:
select *
from `project`.`dev_staging`.`sample_model`
where sample_column is null
Ideally if you don’t care about storing results:
select sample_column
from `project`.`dev_staging`.`sample_model`
where sample_column is null