Dynamically Defining the Grain of Aggregation

mrufsvold · May 2, 2025, 3:11pm

The problem I’m having

I cannot find documentation about dynamically selecting the grain of an aggregation step.

The context of why I’m trying to do this

I manage a pipeline that consumes raw health-related data and produces an aggregated “gold-level” analytic file for researchers. It’s currently implemented in a custom dbt-like system, and I am investigating what it would take to migrate to dbt.

An essential part of the current system is that the analyst can say “I’d like to mark outliers within groups defined by dimensions x, y, and z, and then I’d like to calculate summary statistics for non-outliers over groups defined by x, y, z, and a”, where the sets of columns used in the various GROUP BYs depend on their analysis.
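To make that concrete, here is a rough Python sketch of the kind of templating I mean. It is not how the current system is actually written: the `GrainSpec` class, the `measurements` table, the `value` column, and the 3-standard-deviation outlier rule are all made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GrainSpec:
    """Hypothetical analyst input: the two sets of grouping columns."""

    outlier_grain: tuple[str, ...]   # groups within which outliers are marked
    summary_grain: tuple[str, ...]   # groups over which statistics are computed


def render_sql(spec: GrainSpec, table: str = "measurements", value_col: str = "value") -> str:
    """Render a two-step query whose grains come from the analyst's spec.

    The table name, value column, and 3-sigma rule are assumptions.
    Column names are interpolated as-is, so they must be checked beforehand.
    """
    outlier_cols = ", ".join(spec.outlier_grain)
    summary_cols = ", ".join(spec.summary_grain)
    return (
        "with flagged as (\n"
        "    select\n"
        "        *,\n"
        f"        abs({value_col} - avg({value_col}) over (partition by {outlier_cols}))\n"
        f"            > 3 * stddev({value_col}) over (partition by {outlier_cols}) as is_outlier\n"
        f"    from {table}\n"
        ")\n"
        f"select {summary_cols}, count(*) as n, avg({value_col}) as mean_value\n"
        "from flagged\n"
        "where not is_outlier\n"
        f"group by {summary_cols}\n"
    )


spec = GrainSpec(outlier_grain=("x", "y", "z"), summary_grain=("x", "y", "z", "a"))
print(render_sql(spec))
```

The point is only that the two column lists are inputs, not something hard-coded in the model.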

What I’ve already tried

I’m sure I’m being dumb, but I can’t find examples or docs where an input to the pipeline determines which columns get inserted into a SELECT or GROUP BY. Bonus points if there is some robust way to confirm that the requested columns actually exist.
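For the “bonus points” part, the check I have in mind is roughly the following Python sketch. The function name and the example column lists are hypothetical; where the list of available columns would come from in dbt is exactly the part I don’t know.

```python
from __future__ import annotations


def missing_columns(requested: list[str], available: list[str]) -> list[str]:
    """Return the requested grouping columns that are not in the relation.

    Comparison is exact (case-sensitive), which is an assumption; some
    warehouses fold identifier case.
    """
    available_set = set(available)
    return [col for col in requested if col not in available_set]


# Hypothetical relation columns and analyst request.
available = ["x", "y", "z", "a", "value"]
print(missing_columns(["x", "y", "b"], available))  # ['b']
```

Ideally a non-empty result would fail the run before any SQL is executed.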
