I like the idea of testing transformations on dummy data, but at first glance dtspec looks overly complicated to me.
I’ll give a potential implementation of a "poor man’s dtspec":
- Use seeds to create tables with some special prefix - for example `test_`. For a source table `source('jaffle_shop', 'orders')`, the test table would be named `test_jaffle_shop__orders`. For a model `ref('customers')`, the test table would be named `test_customers`. The idea is to use the naming convention to create a one-to-one relationship between sources/models and their corresponding test tables.
- Find all test seeds that refer to models - either by eliminating the ones that have two underscores (`__`) in their name or by cross-referencing with the `graph` variable. For each such model:
  - Somehow temporarily and recursively make the `ref('model')` macro resolve to `test_model` and `source('source', 'table')` resolve to `test_source__table`.
  - Run the generated SQL and make sure the output matches the contents of the corresponding test seed.
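To make the convention concrete, here is a rough Python sketch of the steps above. It is only an illustration, not how dbt or dtspec works internally: all function names are made up, and the regex rewrite is a plain text-level stand-in for the "somehow make `ref` and `source` resolve to the test tables" step, which inside a real dbt project would need to happen in the macros themselves.

```python
import re
from collections import Counter

TEST_PREFIX = "test_"  # prefix from the example above; any prefix would do

# Hypothetical patterns matching only the simplest {{ ref('x') }} and
# {{ source('s', 't') }} calls with quoted literal arguments.
REF_RE = re.compile(r"\{\{\s*ref\(\s*['\"](\w+)['\"]\s*\)\s*\}\}")
SOURCE_RE = re.compile(
    r"\{\{\s*source\(\s*['\"](\w+)['\"]\s*,\s*['\"](\w+)['\"]\s*\)\s*\}\}"
)


def seed_name_for_source(source_name, table_name):
    """source('jaffle_shop', 'orders') -> 'test_jaffle_shop__orders'."""
    return f"{TEST_PREFIX}{source_name}__{table_name}"


def seed_name_for_model(model_name):
    """ref('customers') -> 'test_customers'."""
    return f"{TEST_PREFIX}{model_name}"


def model_seeds(seed_names):
    """Keep the test seeds that stand for models: prefixed, and no '__'."""
    return [
        name
        for name in seed_names
        if name.startswith(TEST_PREFIX) and "__" not in name[len(TEST_PREFIX):]
    ]


def rewrite_to_test_tables(raw_sql):
    """Replace ref()/source() calls in a model's raw SQL with test seed names."""
    sql = REF_RE.sub(lambda m: seed_name_for_model(m.group(1)), raw_sql)
    return SOURCE_RE.sub(
        lambda m: seed_name_for_source(m.group(1), m.group(2)), sql
    )


def rows_match(actual_rows, expected_rows):
    """Compare query output with the expected seed rows, ignoring row order."""
    return Counter(map(tuple, actual_rows)) == Counter(map(tuple, expected_rows))
```

One catch with the double-underscore heuristic: a model whose own name contains `__` would be mistaken for a source seed, which is a reason to prefer cross-referencing with the `graph` variable.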
All this can be attained just by using seeds and with minimal configuration. So what’s the advantage of using actual dtspec?