The problem I’m having
I have a post_hook in each of my ~1950 table models that is adding a lot of time to our overall transformation processing. However, we only NEED the post-hook if there has been a DDL change to the table the model is updating.
The context of why I’m trying to do this
We have a business requirement for legacy support. In past solutions, we used 2 paired tables (e.g. ACTIVITY and ACTIVITY_RT) where the first table held historical data and the _RT table held the last 30 days of “real-time” data. With dbt and Snowflake, we are doing away with the multi-table model, but we still need an empty *_RT table for every table so that our customers’ UNION ALL queries don’t break when we roll out our new solution. Additionally, the table and its empty _RT counterpart must have the same columns or the UNIONs will fail.
What I’ve already tried
To handle this requirement, I built a post_hook into each model as follows:
{{config(
materialized = 'incremental',
enabled = true,
unique_key='unique_key',
tags=["dbt_run:hourly"],
cluster_by=['customer_schema'],
post_hook='CREATE OR REPLACE TABLE SECURE.activity_RT LIKE CURATED.activity'
)
}}
To generate this, I’m using the following .py code in dbt-core:
f.write("\n")
f.write("{{")
f.write("config(")
f.write("\n")
f.write(" materialized = 'incremental',")
f.write("\n")
f.write(" enabled = true,")
f.write("\n")
f.write(" unique_key='unique_key',")
f.write("\n")
f.write(f' tags=["{run_frequency}"],')
f.write("\n")
f.write(f" cluster_by=['customer_schema'],")
f.write("\n")
f.write(f" post_hook='CREATE OR REPLACE TABLE SECURE.{table['table_name']}_RT LIKE CURATED.{table['table_name']}'")
f.write("\n")
f.write(")")
f.write("\n")
f.write("}}")
The problem is that I only need to CREATE/REPLACE the _RT table if the DDL of its paired table has changed. Otherwise, this is just wasted resources in hour hourly transforms where we run all ~1950 models.
I have two questions:
- How do I condition my post_hook to run ONLY IF the model has DDL changes (new column, altered column datatype, etc.)?
- How do I set this up as a macro instead of hard-coding into each model?
Thanks in advance for your help!
Johnny