Schema tests - Disabling SELECT * in Column tests for a model

The problem I’m having

I’m working on a project that uses dbt on BigQuery. We have multiple models and have a bunch of schema tests for our models. Something that I noticed was that schema tests will try to pull in all the columns even though the test is specific for a column. I’m assuming this has to do with dbt storing all the associated columns with the rows that might fail the test.

The context of why I’m trying to do this

Some models might have a lot of columns and the associated cost with pulling in all columns to just test a single column is unnecessary and users should have the option to disable this. But I’m not sure if there exists a way to disable this

What I’ve already tried

Looked through documentation, found something around storing failures (you can set it as false and avoid storing failures) - [store_failures | dbt Developer Hub] but I’m assuming that even after this there is a select * for the schema test.

Some example code or error messages

A sample model and it’s schema test for sample_column

  - name: sample_model
    columns:
      - name: sample_column
        tests:
          - not_null

The above schema test would compile down to the following:

select *
from `project`.`dev_staging`.`sample_model`
where sample_column is null 

Ideally if you don’t care about storing results:

select sample_column
from `project`.`dev_staging`.`sample_model`
where sample_column is null 

Why would you want to store failures but not want any identifying values within the stored data?
storing:

would just give you a table with a whole bunch of NULLs. it wouldn’t be helpful in identifying the problem data, so might as well just do store_failures: false and not store anything.
Maybe I’m misunderstanding your question… ?

@alyk thank you for pointing it out, I have edited my initial thoughts to be a little more specific. I was thinking that setting store _failures as false would still use a SELECT * underneath but from compiling my local project it does seem like it switches to just using the column if it’s set as false.

This is correct - this behaviour was added in [CT-271] [Feature] not_null test selects column instead of * by willbowditch · Pull Request #4777 · dbt-labs/dbt-core · GitHub

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.