Hello all!
I wanted to take some time to describe the small testing framework we’ve put together for our dbt
project.
The main components are:
- source schemas are loaded from
sql
files (test/schemas/SCHEMA.sql
) - source data are loaded from
yaml
files (test/fixtures/SCHEMA/TABLE.yml
) - the target data warehouse is built into postgres
- we run a handful of unit tests in ruby
-
we capture snapshots of the resulting target data to
yaml
files that we check in git (test/snapshots/SCHEMA/TABLE.yml
).
Capturing snapshots of the resulting data warehouse has sped up the development and code review process quite a lot and made us much more confident when making changes. We can verify that the code change is having the expected effect on the data warehouse and does not have any unintended side effects.
Here is for example a diff when changing the datatype of a column and adding new columns:
I’m happy to go into more detail if you have any questions or comments!