Since dbt_utils.star
doesn’t work on CTEs, I’ve made a wee macro which solves my most common need: pulling all but a couple of columns out of a CTE in the same model file.
To use it, you need to set the columns in your select
statement to a variable. Then you can pass it into the CTE as normal, but also pass it into the star_cte
macro, which will split on commas (except in brackets) and then take the last element after splitting on spaces to get the column’s name/alias.
A super-contrived example:
with organisations as (select * from {{ ref('organisations') }}),
transformed as (
select
{%- set select_columns = "
org_id,
'a' b,
dateadd(month, 1, getdate()) as some_date,
region_name
"-%}
{{select_columns}}
from organisations
),
final as (
select {{ star_cte(select_columns, 'transformed', except=['org_id', 'b']) }}, count(*)
from transformed
group by 1, 2
)
select *
from final
renders to:
with organisations as (select * from "reporting"."dev_jlabes"."organisations"),
transformed as (
select
org_id,
'a' b,
dateadd(month, 1, getdate()) as some_date,
region_name
from organisations
),
final as (
select
transformed.some_date,
transformed.region_name, count(*)
from transformed
group by 1, 2
)
select *
from final
The macro itself is pretty brief:
{%- macro star_cte(select_columns, cte_name = '', except=[]) -%}
{%- set include_cols = [] -%}
{%- set nsp = namespace(is_in_brackets = False) %} {# "Move the variable's scope to be global" #}
{%- for col in select_columns.split(',') -%}
{%- if ')' in col %} {# "Argments to functions shouldn't be treated as column names" #}
{%- set nsp.is_in_brackets = False %}
{%- elif '(' in col %}
{%- set nsp.is_in_brackets = True %}
{%- endif %}
{%- set col_tidied = col.strip().split(' ') | last -%}
{%- if col_tidied not in except and not nsp.is_in_brackets -%}
{%- set _ = include_cols.append(col_tidied) -%}
{%- endif -%}
{%- endfor -%}
{% for col in include_cols %}
{% if cte_name | length > 0 %}
{{- cte_name -}} .
{%- endif -%}
{{- col -}}
{{- ", " if not loop.last }}
{%- endfor -%}
{%- endmacro -%}
It doesn’t help if you’re using ephemeral models upstream, but that use case doesn’t apply to me. This just helps DRY up an otherwise-messy piece of our models.