How to handle generated columns in target model in incremental materialization

ypc-faros · May 11, 2022, 7:25pm

Hello!

Quick post for anyone wondering how to make incremental materialization work with generated columns. Context: in Faros CE, we model and EngOps entities (PRs, Incidents, …) in PostgreSQL for use in BI and automation. We have some columns that are automatically generated by extracting data from some other columns storing jsonb like this:

alter table "ims_Incident" add column "statusDetail" text generated always as (status ->> 'detail') stored;

We have a transform that tries to incrementally write to that ims_Incident table. With the incremental macro as is, the generated sql for the task to incident transform looks like this:

delete from "faros"."public"."ims_Incident"

where (id) in (

select (id)

from "ims_Incident__dbt_tmp141746223081"

);

insert into "faros"."public"."ims_Incident" ("id", "origin", "refreshedAt", "uid", "title", "description", "url", "severity", "priority", "status", "createdAt", "updatedAt", "acknowledgedAt", "resolvedAt", "source", "severityDetail", "severityCategory", "priorityDetail", "priorityCategory", "statusDetail", "statusCategory")

(

select "id", "origin", "refreshedAt", "uid", "title", "description", "url", "severity", "priority", "status", "createdAt", "updatedAt", "acknowledgedAt", "resolvedAt", "source", "severityDetail", "severityCategory", "priorityDetail", "priorityCategory", "statusDetail", "statusCategory"

from "ims_Incident__dbt_tmp141746223081"

)

Because it includes generated columns (e.g., statusDetail), this query will fail.

We decided to modify the macro to accept a list of columns to ignore when generating these queries. We adapted the main incremental macro.

Instead of using every single column (adapter.get_columns_in_relation(existing_relation)) in the existing table to generate the query, we ignore the ones specified via the “ignore_columns” configuration for that table (if any).

The change was pretty straightforward. Although to the best of our knowledge it is not documented, having such a macro in your macros folder will override the default one. We can then pass the list of columns to ignore via a configuration in the transform.

After these changes, the generated query looks like below:

delete from "faros"."public"."ims_Incident"

where (id) in (

select (id)

from "ims_Incident__dbt_tmp142134657426"

);

insert into "faros"."public"."ims_Incident" ("origin", "refreshedAt", "uid", "title", "description", "url", "severity", "priority", "status", "createdAt", "updatedAt", "acknowledgedAt", "resolvedAt", "source")

(

select "origin", "refreshedAt", "uid", "title", "description", "url", "severity", "priority", "status", "createdAt", "updatedAt", "acknowledgedAt", "resolvedAt", "source"

from "ims_Incident__dbt_tmp142134657426"

)

The columns specified in the configuration are no longer included.

martinburch · May 11, 2022, 8:05pm

Nice! Could you please show a sample use of the macro invocation in the model file—is it something like …

{{
    config(
        materialized='incremental',
        unique_key='id',
        ignore_columns=['id']   
    )
}}

… at the top of the model file?

ypc-faros · May 11, 2022, 8:16pm

Yes, it looks like this:

{
  config(
    alias='ims_Incident', 
    materialized='incremental',
    unique_key='id',
    ignore_columns={
      'ims_Incident': [
        'id', 
        'severityDetail', 
        'severityCategory', 
        'priorityDetail', 
        'priorityCategory', 
        'statusDetail', 
        'statusCategory'
      ]
    }
  ) 
}}

You can see the model here

(I couldn’t link to it in the original post because some limit in the number of links for new accounts)

martinburch · May 11, 2022, 8:17pm

Ah, thanks, I’m glad I asked. Why is that extra level of specifying the table name necessary in ignore_columns?

ypc-faros · May 11, 2022, 8:41pm

Thanks for asking!

Because I needed the workaround for another model

We also have a write up where we talk about the actual use cases here

Topic		Replies	Views
Dynamic Column Generation and Incremental Updates Help	2	245	December 2, 2024
Macro adding metadata to incremental model Help	0	19	March 4, 2025
Enquiry on incremental materialisation Help incremental	4	1384	November 30, 2022
Do all columns in target have to be in the model Help jinja , bigquery , macros , dbt-core	4	108	February 3, 2025
Ephemeral models, snapshots and new columns Help snapshots , ephemeral-models	0	115	February 26, 2025

How to handle generated columns in target model in incremental materialization

Related topics