Hi, I was wondering how to factor out the SQL shared by very similar models.
Would you suggest writing the generic model as a macro with multiple parameters and then calling it from each model? Or would you suggest building each model independently, without any common macro?
The first option deduplicates the code, shrinks the code base, lets a change propagate to every model, etc… But the generic model as a macro cannot be tested, which is a serious drawback…
What do you think of this practice?
The answer to this question comes down to maintainability. If you have a sufficiently large number of models that are all “similar enough” that a macro with inputs could cover them, and maintaining that macro is easier, I say go for it! Write less code!
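To make this concrete, here is a minimal sketch of what such a parameterized macro and one of its models could look like. All names (the macro, columns, and referenced models) are hypothetical, just to illustrate the pattern:

```sql
-- macros/customer_rollup.sql (hypothetical macro)
{% macro customer_rollup(source_relation, date_column, amount_column) %}

select
    customer_id,
    date_trunc('month', {{ date_column }}) as order_month,
    sum({{ amount_column }})               as total_amount
from {{ source_relation }}
group by 1, 2

{% endmacro %}
```

Each concrete model then reduces to a single macro call:

```sql
-- models/customer_rollup_eu.sql (hypothetical model)
{{ customer_rollup(ref('stg_orders_eu'), 'ordered_at', 'amount_eur') }}
```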
I would also question your second point, that
“the generic model as a macro cannot be tested”
As far as I am aware, models created via macro calls are just as testable as those written by hand, via column- and table-level configurations. I’ve also found that a script in the /analysis folder is a great way to manually inspect macro outputs, as is the run-operation command.
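For example, a model whose body is just a macro call can still carry ordinary schema tests, exactly like a hand-written model (model and column names here are hypothetical):

```yaml
# models/schema.yml
models:
  - name: customer_rollup_eu   # model generated from a macro call
    columns:
      - name: customer_id
        tests:
          - not_null
          - unique
```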
One thing that CAN be a pain about testing models whose source code is generated from macros is maintaining a configuration file for each. I’ve found it helpful to create a separate macro specifically designed to generate the testing-configuration yml for those models, so you maintain just those two macros instead of potentially 100s of .sql and .yml files.
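A rough sketch of what such a yml-generating macro could look like, assuming the same hypothetical model family as above (the macro simply logs the yml so you can paste it into a schema file):

```sql
-- macros/generate_test_yml.sql (hypothetical)
{% macro generate_test_yml(model_names) %}
{% set yml %}
models:
{% for name in model_names %}
  - name: {{ name }}
    columns:
      - name: customer_id
        tests:
          - not_null
{% endfor %}
{% endset %}
{{ log(yml, info=True) }}
{% endmacro %}
```

You could then invoke it from the CLI with run-operation, e.g. `dbt run-operation generate_test_yml --args '{model_names: [customer_rollup_eu, customer_rollup_us]}'`.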
Hope this helps!
Thanks for your answers, these are all very good options.
Still, I feel the proper approach would be to test the output of the macro rather than all the downstream models, which could be aggregations and therefore hard to test.
Have you found a way to test the output of a macro with dbt test rather than a manual test?
I guess it would be great if dbt could run a dbt test directly on the macro’s output.
Another solution could be to create ephemeral models just for testing purposes. What do you think?