Support for additive dbt labels

The problem I’m having

I want to specify default labels for all dbt models inside dbt_project.yml. Because no model context is accessible from that file, I cannot set the “owner” label, which is stored in the model’s metadata. If I specify only the owner label inside a config block or models.yml, labels are not additive, so adding the “owner” label overrides all the default ones.
I temporarily solved this using post-hooks and a custom macro, but would really like to either:

  • be able to set all default labels in dbt_project.yml and access the model’s meta from there, or
  • set default labels inside dbt_project.yml and only add new labels in configs or models.yml files, like I can do with tags and meta.

Hi!

I found a solution for this in the Slack community, and adjusted it a bit.

Here is a macro that you store in the macros folder:

{% macro add_table_label_last_updated(table_ref) -%}
    {#- Views are skipped: the "alter table" statement below does not apply to them. -#}
    {%- if config.get("materialized") != "view" -%}
    {#- Fall back to empty collections so the length checks never see None. -#}
    {% set tags = config.get("tags") or [] %}
    {% set meta = config.get("meta") or {} %}
    {% set re = modules.re %}
    {#- Matches every character that is not a lowercase letter, digit, underscore or dash. -#}
    {% set label_pattern = '[^a-z0-9_-]' %}

    alter table {{ table_ref }}
    set options (labels=[
        ('created_in', 'dbt'),
        ('environment', '{{ target.name }}')
        {% if (tags | length) > 0 -%},
            {% for tag in tags -%} ('dbt_tag_{{ loop.index }}', '{{ tag }}') {% if not loop.last %},{% endif %}{%- endfor -%}
        {%- endif -%}
        {% if (meta | length) > 0 -%},
            {% for key, value in meta.items() -%}
                {#- Cast to string first: meta values may be numbers or booleans. -#}
                {% set search_key = re.sub(label_pattern, '', key | string | lower) %}
                {% set search_value = re.sub(label_pattern, '', value | string | lower) %}
                ('{{ search_key }}', '{{ search_value }}'){% if not loop.last %},{% endif %}
            {%- endfor -%}
        {%- endif -%}
        ]
    )
    {%- endif %}
{%- endmacro %}
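
To see what the two re.sub calls in the macro do to a meta entry, here is a rough Python equivalent. The helper name sanitize_label and the sample meta values are made up for illustration; the pattern is the one from the macro, and the str() cast is an addition so that non-string meta values do not raise.

```python
import re

# Same pattern as label_pattern in the macro: anything that is not a
# lowercase letter, digit, underscore or dash gets removed.
LABEL_PATTERN = re.compile(r"[^a-z0-9_-]")


def sanitize_label(text):
    """Hypothetical helper: lowercase the text, then strip disallowed characters."""
    return LABEL_PATTERN.sub("", str(text).lower())


# Made-up meta block, as it might appear on a model.
meta = {"Owner": "Data Team", "cost center": "FIN-42"}
labels = {sanitize_label(key): sanitize_label(value) for key, value in meta.items()}
print(labels)  # {'owner': 'datateam', 'costcenter': 'fin-42'}
```

Note that spaces are dropped rather than replaced, so “Data Team” becomes “datateam”. As far as I know BigQuery also caps label keys and values at 63 characters, which neither the macro nor this sketch enforces.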

Then you add this to your dbt_project.yml under models:

    post-hook: "{{ add_table_label_last_updated(this) }}"

What this version of the macro does:

  • It adds the name of the dbt target as an “environment” label
  • It adds a “created_in” label that indicates the table was made in dbt
  • It adds the dbt tags as labels
  • It adds every meta field as a label, as a key-value pair

The dbt BigQuery labels config is not working well. As you said, labels set on a model replace the defaults set at a higher level instead of being merged with them. But if we instead leverage the meta: and tags: functionality, which is additive, it works well.

I did something similar to achieve this using post-hooks. The problem is that post-hooks are executed on every run, for every model, which adds overhead, especially for hourly and daily models. By contrast, labels specified in dbt_project.yml are applied to incremental models only on the first run.

Also, the macro is destructive: it replaces the table’s whole label set rather than adding to it. Consequently, labels cannot be set from anywhere else (an Airflow DAG, etc.), only from this post-hook macro.
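
One way around the destructive behavior would be to read the labels already on the table and layer the dbt-managed ones on top. Below is a minimal sketch, assuming a client that behaves like google.cloud.bigquery.Client (its get_table and update_table methods); merge_labels and apply_labels are hypothetical names, and the sample labels are invented.

```python
def merge_labels(existing, managed):
    """Return the existing labels with the dbt-managed ones layered on top."""
    merged = dict(existing or {})
    merged.update(managed)
    return merged


def apply_labels(client, table_id, managed):
    """Merge labels into a table instead of replacing the whole set.

    `client` is assumed to behave like google.cloud.bigquery.Client.
    """
    table = client.get_table(table_id)
    table.labels = merge_labels(table.labels, managed)
    # Only the labels field is included in the update request.
    return client.update_table(table, ["labels"])


# A label set elsewhere (for example by an Airflow DAG) survives the merge.
existing = {"airflow_dag": "daily_load", "owner": "old"}
managed = {"created_in": "dbt", "owner": "data"}
print(merge_labels(existing, managed))
# {'airflow_dag': 'daily_load', 'owner': 'data', 'created_in': 'dbt'}
```

With the real library the client would come from bigquery.Client(). This keeps outside labels intact, but it is still an extra call per table, so the overhead concern above remains.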

At first, I wanted to add (update) labels at every deploy to production using a simple Python script, since the labels and meta cannot change otherwise. The problem is models that are materialized as tables: those tables are replaced on every run, so the labels disappear.
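
A deploy-time script along those lines could derive the labels from dbt’s manifest rather than from a post-hook. The sketch below assumes the usual target/manifest.json layout (a nodes mapping whose entries carry resource_type, database, schema, alias and config with tags and meta); labels_from_manifest and the sample node are hypothetical, and label sanitization is left out for brevity.

```python
import json


def load_manifest(path="target/manifest.json"):
    """Read the manifest that dbt writes during compile or run."""
    with open(path) as handle:
        return json.load(handle)


def labels_from_manifest(manifest):
    """Map each model's table ID to labels built from its tags and meta."""
    result = {}
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue
        config = node.get("config", {})
        # Views and ephemeral models are skipped in this sketch.
        if config.get("materialized") in ("view", "ephemeral"):
            continue
        labels = {"created_in": "dbt"}
        for index, tag in enumerate(config.get("tags") or [], start=1):
            labels[f"dbt_tag_{index}"] = str(tag)
        for key, value in (config.get("meta") or {}).items():
            labels[str(key)] = str(value)
        name = node.get("alias") or node["name"]
        result[f"{node['database']}.{node['schema']}.{name}"] = labels
    return result


# Small in-memory stand-in for load_manifest(), with invented names.
sample = {
    "nodes": {
        "model.shop.orders": {
            "resource_type": "model",
            "database": "my-project",
            "schema": "analytics",
            "name": "orders",
            "alias": "orders",
            "config": {"materialized": "table", "tags": ["daily"], "meta": {"owner": "data"}},
        }
    }
}
print(labels_from_manifest(sample))
```

The resulting mapping could then be applied table by table. It does not solve the issue described above: any table that dbt recreates between deploys loses its labels until the script runs again.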