The problem I’m having
I cannot find documentation on dynamically selecting the grain of an aggregation step, that is, choosing the GROUP BY columns from an input rather than hard-coding them in the model.
The context of why I’m trying to do this
I manage a pipeline that consumes raw health-related data and produces an aggregated “gold-level” analytic file for researchers. It’s currently implemented in a custom dbt-like system, and I am investigating what it would take to migrate to dbt.
An essential part of the current system is that the analyst can say, “I’d like to mark outliers within groups defined by dimensions x, y, and z, and then calculate summary statistics for non-outliers over groups defined by x, y, z, and a”, where the sets of columns used in the various GROUP BYs depend on their analysis.
What I’ve already tried
I’m sure I’m being dumb, but I can’t find examples or docs where an input to the pipeline is used to set the columns that a model selects and groups by. Bonus points if there is some robust way to confirm that the requested columns exist.
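
To make the ask concrete, here is a rough sketch, in plain Python rather than dbt, of the behaviour I’m after. Everything in it is made up for illustration: the function names, the `raw_obs` table, the `value` column, and the 3-standard-deviation outlier rule are all placeholders, not what our system actually does. It only renders a SQL string from two column lists and refuses to do so if a requested column is not in the list of known columns.

```python
from typing import Iterable, Sequence


def validate_columns(requested: Sequence[str], available: Iterable[str]) -> None:
    """Raise if any requested column is not among the known upstream columns."""
    missing = sorted(set(requested) - set(available))
    if missing:
        raise ValueError(f"Unknown columns requested: {missing}")


def render_summary_sql(
    table: str,
    value_col: str,
    outlier_dims: Sequence[str],
    summary_dims: Sequence[str],
    available: Iterable[str],
) -> str:
    """Render a two-step query whose grains come from the caller.

    Step 1 flags outliers within groups defined by `outlier_dims`.
    Step 2 summarises non-outliers over groups defined by `summary_dims`.
    The outlier rule (more than 3 standard deviations from the group mean)
    is a placeholder assumption.
    """
    validate_columns([value_col, *outlier_dims, *summary_dims], available)

    partition = ", ".join(outlier_dims)
    grain = ", ".join(summary_dims)

    return (
        "with flagged as (\n"
        "    select *,\n"
        f"        abs({value_col} - avg({value_col}) over (partition by {partition}))\n"
        f"            > 3 * stddev({value_col}) over (partition by {partition}) as is_outlier\n"
        f"    from {table}\n"
        ")\n"
        f"select {grain},\n"
        "    count(*) as n,\n"
        f"    avg({value_col}) as mean_value\n"
        "from flagged\n"
        "where not is_outlier\n"
        f"group by {grain}\n"
    )


if __name__ == "__main__":
    known = ["x", "y", "z", "a", "value"]
    print(render_summary_sql("raw_obs", "value", ["x", "y", "z"], ["x", "y", "z", "a"], known))
```

In dbt terms, I’m looking for the equivalent of passing `outlier_dims` and `summary_dims` in from outside the model, plus something that plays the role of `validate_columns` against the real upstream relation.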