Test only new data in incremental models


I work at a fairly large retail company in Sweden. We handle huge amounts of data, and data quality is of course important to our company and to the users of our reports. We use dbt together with BigQuery.

We have a lot of incremental models, but for various reasons not all of them are built on partitioned tables. Is there a way to create tests that scan only the newly added or updated data in an incremental model, instead of always doing a full scan of that model?

I would love to have something like this:

{% if full_refresh %}

do a full-scan test

{% elif is_incremental %}

test only new / updated data

{% endif %}

Maybe you could add a column to your model that records when each row was transformed.

You could then use this column with the where config on tests (see "where" on the dbt Developer Hub) to test only recent data.

I suppose you could use a flag within your test code:

{% if flags.FULL_REFRESH %}
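
To make that a bit more concrete, here is a rough sketch of a singular test that uses the flag. The model name orders, the columns order_id and updated_at, and the 2-day window are all made up for illustration. It also assumes the test runs in the same invocation as the refresh (for example dbt build --full-refresh), since flags.FULL_REFRESH only reflects the flag passed to the current command.

```sql
-- tests/assert_orders_have_order_id.sql  (hypothetical file and model names)
-- A singular test fails if this query returns any rows.
select
    order_id,
    updated_at
from {{ ref('orders') }}
where order_id is null
{% if not flags.FULL_REFRESH %}
  -- Normal run: only look at rows written recently.
  -- Assumes the model stamps each row with an updated_at timestamp.
  and updated_at >= timestamp_sub(current_timestamp(), interval 2 day)
{% endif %}
```

With --full-refresh the extra filter is left out, so the test checks the whole table; otherwise it only checks rows from the last 2 days.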

Yes, but how do I tell the test to run only on the newly added or updated data? Will that be the case out of the box, or do I have to add logic as Bruno states above, where a column is added to record whether the data has changed?

That is a great idea. What could the logic for that column look like?

In my models I add a column

current_timestamp() as updated_at

and then, in the tests for new data, I filter on the current day or the last 2 days, depending on how often you run the tests.
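
For illustration, an incremental model with such a column might look roughly like this. It is only a sketch: the source shop.raw_orders, the key order_id and the source_loaded_at column used to pick up new rows are assumptions, so adjust them to your own data.

```sql
-- models/orders.sql  (hypothetical model; all names are examples)
{{ config(
    materialized = 'incremental',
    unique_key = 'order_id'
) }}

select
    order_id,
    customer_id,
    amount,
    source_loaded_at,
    -- Stamp every row that this run inserts or updates.
    current_timestamp() as updated_at
from {{ source('shop', 'raw_orders') }}

{% if is_incremental() %}
-- On incremental runs, only pick up rows newer than what is already loaded.
where source_loaded_at > (select max(source_loaded_at) from {{ this }})
{% endif %}
```

Rows that an incremental run inserts or updates get a fresh updated_at, so a test filter such as updated_at >= timestamp_sub(current_timestamp(), interval 2 day) only sees recently written rows.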

But without partitioning I suppose this will not save you any cost, since BigQuery still scans the full table for the columns the test reads. The execution time may be lower, though.