Tracking Transformation Summaries and Data Quality in dbt

When I group or perform certain transformations from one table to another, is it possible to get a summary of that? For example, when I run dbt run and I have a view A with 1 million observations in the intermediate layer, and I create a table in the mart layer that groups view A by certain features, resulting in a table with 10k observations, I’d like to see:

  • The size reduction in observations: from 1 million to 10k.
  • The number and percentage of null values for certain features.

Would it be possible to see this in the DAG visualization? If not, how would you implement this? I want to use these summaries as a quick general overview of how the data is being processed and transformed across the layers.

The dbt-artifacts project can help you with information about project execution.
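To give a rough idea of the kind of execution metadata involved: after each run, dbt writes a run_results.json file to the target directory, and a small script can flatten it into one row per model. The sketch below is not how dbt-artifacts itself works. The function name summarize_run_results and the sample input are made up for illustration, and rows_affected is only reported by some adapters and materializations, so treat it as optional.

```python
import json


def summarize_run_results(run_results: dict) -> list:
    """Flatten a parsed run_results.json into one summary dict per node."""
    rows = []
    for result in run_results.get("results", []):
        # adapter_response can be empty or absent depending on the adapter.
        adapter_response = result.get("adapter_response") or {}
        rows.append({
            "unique_id": result.get("unique_id"),
            "status": result.get("status"),
            "execution_time": result.get("execution_time"),
            # Assumption: not every adapter/materialization reports this.
            "rows_affected": adapter_response.get("rows_affected"),
        })
    return rows


# Hypothetical, hand-written input that mimics the shape of run_results.json.
sample = {
    "results": [
        {
            "unique_id": "model.my_project.int_view_a",
            "status": "success",
            "execution_time": 0.4,
            "adapter_response": {},
        },
        {
            "unique_id": "model.my_project.mart_grouped",
            "status": "success",
            "execution_time": 2.1,
            "adapter_response": {"rows_affected": 10000},
        },
    ]
}

summary = summarize_run_results(sample)
for row in summary:
    print(row)
```

Against a real project you would load the file instead of the sample, for example with json.load(open("target/run_results.json")). Note that this only tells you what dbt executed; it does not give you null percentages or the row count of a view.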
Also, you can have a look at "Data Quality Testing: Ways to Test Data Validity and Accuracy" (lakefs.io). One of the tools mentioned there is dbt-expectations. It can help you detect the percentage of nulls, and if you look at the compiled code, you may be able to come up with your own data quality tables.
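If you go the route of building your own data quality tables, the underlying queries are simple: COUNT(*) for the row count and COUNT(column) for the non-null count of each feature. Here is a minimal sketch of that idea, using an in-memory SQLite database as a stand-in for the warehouse. The table names int_orders and mart_customer_totals, the columns, and the profile_table helper are all hypothetical, and the data is a toy example rather than real output.

```python
import sqlite3


def profile_table(conn, table, columns):
    """Return the row count plus null count and null percentage per column."""
    # Identifiers are interpolated into the SQL, so only pass trusted names.
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    nulls = {}
    for col in columns:
        # COUNT(col) skips NULLs, so total minus COUNT(col) is the null count.
        non_null = conn.execute(f"SELECT COUNT({col}) FROM {table}").fetchone()[0]
        null_count = total - non_null
        null_pct = round(100.0 * null_count / total, 2) if total else 0.0
        nulls[col] = {"null_count": null_count, "null_pct": null_pct}
    return {"table": table, "row_count": total, "nulls": nulls}


def build_demo():
    """Create a toy intermediate table and a grouped mart table from it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE int_orders (customer_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO int_orders VALUES (?, ?, ?)",
        [
            (1, "EU", 10.0),
            (1, "EU", None),
            (2, None, 5.0),
            (2, None, 7.5),
            (3, "US", 1.0),
            (3, "US", 2.0),
        ],
    )
    # Stand-in for the mart model: group the intermediate rows by features.
    conn.execute(
        """
        CREATE TABLE mart_customer_totals AS
        SELECT customer_id, region, SUM(amount) AS total_amount
        FROM int_orders
        GROUP BY customer_id, region
        """
    )
    return conn


conn = build_demo()
source = profile_table(conn, "int_orders", ["region", "amount"])
mart = profile_table(conn, "mart_customer_totals", ["region", "total_amount"])

print(source)
print(mart)
print(f"rows: {source['row_count']} -> {mart['row_count']}")
```

In a dbt project the same two queries could live in a model or macro that writes one row per table and column into a summary table, which you can then query after each run. As far as I know, the DAG visualization in dbt docs does not show this kind of statistic on its own, so a separate summary table is the more practical place to look.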