Hey @joshtemple - we use testing in a few different capacities.
- During development
When creating or modifying a table, I like to add tests for some invariants, like the uniqueness or non-nullness of a given column. This helps me gain confidence that the code I'm writing works as intended. @kriselle makes a great point about TDD above, too: for well-defined models, I like to write my tests first, then build the model around them. This flow helps reduce the friction around merging PRs… more on that below.
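For concreteness, here's a minimal sketch of what those tests look like in a schema file (the `orders` model and `order_id` column are hypothetical, just for illustration):

```yaml
# models/schema.yml
version: 2

models:
  - name: orders          # hypothetical model name
    columns:
      - name: order_id    # hypothetical column name
        tests:
          - unique        # no duplicate order_ids allowed
          - not_null      # every row must have an order_id
```

Running `dbt test` will check both invariants and fail loudly if either one is violated.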
- During PRs
We’ve built build-on-PR functionality into Sinter. Whenever a Pull Request is opened against our internal-analytics repo, Sinter kicks off a full run & test of our dbt project in a scratch schema. If any of the models or tests fail, Sinter puts a big red X on the Pull Request; if everything works as intended, Sinter shows a nice green check mark. The contextualized test successes/failures help guide code review and act as a last check before code gets merged into production. In practice, it looks like this:

[screenshot: Sinter's red X / green check commit status on a GitHub Pull Request]
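If you're curious what that boils down to mechanically, here's a rough sketch (Sinter orchestrates all of this for you; the `ci` target here is a hypothetical entry in profiles.yml pointing at a throwaway scratch schema):

```sh
# Build and test the PR's code in an isolated scratch schema,
# so a broken PR never touches production tables.
dbt run --target ci    # "ci" = hypothetical target for a scratch schema
dbt test --target ci   # any failing test marks the PR with a red X
```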
- In production
We also use Sinter to run our models on a scheduled basis. A typical deployment of dbt includes running:
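```sh
dbt run    # build all of the models in the project
dbt test   # then run all of the tests in the project
```

(Your exact invocation may differ depending on your deployment, but that's the gist.)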
So, after every invocation of `dbt run`, we also run all of the tests in the project. If any of these steps fail, we get email notifications from Sinter. This helps alert us when bad or invalid data has arrived from a data source, or when some logic in a model doesn't hold up over time.
There’s lots more to say about testing, deployment, CI, etc. I know many dbt users are doing pretty interesting/complex things. You should cross-post this to #general in Slack!