Creating Dynamic Queries with Jinja and model.yml files

Problem Context:

We have various fields that we do/don’t want to surface to end users in Looker for any given transformation. However, the fields are necessary for transformation logic. In a gross sense, our query relationships goes stagingtransformationsurface only selected fields to visualization tools ('Clean'). However, manually maintaining our Clean queries is growing into a bit of a pain, and we’ve run into some frustrating situations doing so. A key part of this is that we don’t want to actually maintain the SQL in the Clean queries.
We also have limited resources to implement this from a BigQuery or Infra side of things, so we had to do this only with what DBT has to offer.

Solution

We have added a property to every column entry, that we can read via Jinja. This greatly simplifies things, as we already document each column, both as a matter of good practice, and to centralize documentation in DBT–all of our descriptions are written to BQ, and Looker now has the capability to import the descriptions into LookML when creating view files. Part of the design of this is to operate on an allow-list basis–the column must have the property, and the property must = true. At some point, a column must be defined as being allowed/not allowed, and we decided that dbt is right place for us, given our constraints.

I had avoided using the tags hierarchy, as the .yml files were getting pretty big.

An example of the model.yml:

version: 2

models:
  - name: upstream_query
    description: "blablabla"
    columns:
    - name: field_1
      description: "blablabla"
      is_allowed: true
    - name: field_2
      description: "blablabla"
      is_allowed: false
    - name: field_3
      description: "blablabla"
      is_allowed: true

The macro:

{% macro get_allowed_columns_macro(tablename) -%}
	{% if execute %}
		{%- set column_list = [] -%}
		{%- for node in graph.nodes.values() -%}
			{%- if node.name == tablename -%}
				{%- for column, properties in node.columns.items() -%}
					{%- do column_list.append(column) if properties.get('is_allowed') == true -%}
				{%- endfor -%}
			{%- endif -%}
		{%- endfor %}
		{{ return(column_list) }}
	{% endif %}
{%- endmacro %}

An example query:

/* Define the table you are referring to */
{% set tableref = 'upstream_query' -%}
/* Create a [list] of columns that are flagged as field_name == true */
{% set allowed_columns = get_allowed_columns_macro(tableref) -%}

SELECT
{%- for columns in allowed_columns -%}
{%- if not loop.first %}
, {{ columns }}
{%- else %}
{{ columns }}
{%- endif -%}
{%- endfor %}

FROM {{ ref(tableref) }}

Which will compile to:

SELECT
field_1
, field_3

FROM `database.dataset.upstream_query`
1 Like

Hey ! is it possible to do the other way around ?
I mean using the columns in the model.sql then allowing them in the model.yml ?