I wanted to take some time to describe the small testing framework we’ve put together for our
The main components are:
- source schemas are loaded from
- source data are loaded from
- the target data warehouse is built into postgres
- we run a handful of unit tests in Ruby
- we capture snapshots of the resulting target data to YAML files that we check into git
Capturing snapshots of the resulting data warehouse has sped up development and code review considerably and made us much more confident when making changes. We can verify that a code change has the expected effect on the data warehouse and no unintended side effects.
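To make the snapshot idea concrete, here is a minimal sketch of how such a check could look in Ruby. It is not our actual code: the helper names `snapshot_yaml` and `snapshot_matches?`, the `dim_customer.yml` file name, and the sample rows are all made up for illustration, and the rows are plain hashes standing in for whatever a query against the target warehouse would return.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical helper: turn rows (an array of column => value hashes) into a
# stable YAML string. Rows are sorted by a key column and each row's columns
# are sorted by name, so the output does not depend on query or hash ordering.
def snapshot_yaml(rows, key:)
  stable = rows.sort_by { |row| row.fetch(key) }
               .map { |row| row.sort.to_h }
  YAML.dump(stable)
end

# Hypothetical helper: write the snapshot if none exists yet, otherwise
# report whether the current data still matches the stored file.
def snapshot_matches?(path, rows, key:)
  current = snapshot_yaml(rows, key: key)
  unless File.exist?(path)
    File.write(path, current)
    return true
  end
  File.read(path) == current
end

# Usage sketch with made-up rows and a temporary directory standing in for
# the git checkout.
Dir.mktmpdir do |dir|
  path = File.join(dir, "dim_customer.yml")
  rows = [
    { "id" => 2, "name" => "Bob" },
    { "name" => "Alice", "id" => 1 }
  ]
  snapshot_matches?(path, rows, key: "id")          # first run writes the file
  snapshot_matches?(path, rows.reverse, key: "id")  # same data, different order: still matches
end
```

Sorting rows and columns before dumping is the important design choice: it keeps the checked-in files deterministic, so a diff in git only shows up when the data itself changes.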
Here, for example, is the diff produced by changing the data type of a column and adding new columns:
I’m happy to go into more detail if you have any questions or comments!