Creating Custom Materialization

soliev · April 10, 2024, 8:58am

Developing a Custom Materialization Strategy

I’ve observed that, as of now, there are no existing materialization strategies that employ the TRUNCATE INSERT method with tables. Due to this gap, I am motivated to develop my own materialization strategy. Unfortunately, I’ve found that resources to guide me in this endeavour are somewhat scarce.

Code

{% materialization truncate_insert, adapter="default" %}

{% set target_table = this.schema ~ '.' ~ this.name %}

{{ run_hooks(pre_hooks, inside_transaction=False) }}
-- `BEGIN` happens here:
{{ run_hooks(pre_hooks, inside_transaction=True) }}

-- Truncate the target table
{% set truncate_sql %}
    TRUNCATE TABLE {{ target_table }}
{% endset %}

-- Insert the data into the target table
{% set insert_sql %}
    INSERT INTO {{ target_table }}
    {{ sql }}
{% endset %}

{% call statement('main') %}
  {{ truncate_sql }};
  {{ insert_sql }}
{% endcall %}

{{ run_hooks(post_hooks, inside_transaction=False) }}

-- `COMMIT` happens here
{{ adapter.commit() }}

{{return({'relations':[this]})}}

-- `BEGIN` happens here:
{{ run_hooks(post_hooks, inside_transaction=True) }}

{% endmaterialization %}

Challenges and Seeking Guidance

Although the code is operational and meets the intended objectives, it lacks certain elements, such as the tracking of the number of rows manipulated or performing adapter-specific actions. I’m eager to elevate my custom materialization to align with the “dbt” standard. Would anyone be able to recommend additional resources or guides to aid in bridging this gap?

Sample Output

15:35:14  Running with dbt=1.7.10
15:35:14  Registered adapter: postgres=1.7.10
15:35:14  Found 7 models, 2 tests, 1 source, 0 exposures, 0 metrics, 516 macros, 0 groups, 0 semantic models
15:35:14  
15:35:15  Concurrency: 1 threads (target='dev')
15:35:15  
15:35:15  1 of 7 START sql view model public.average_ride_per_month_raw .................. [RUN]
15:35:15  1 of 7 OK created sql view model public.average_ride_per_month_raw ............. [CREATE VIEW in 0.05s]
15:35:15  2 of 7 START sql view model public.filter_month_t .............................. [RUN]
15:35:15  2 of 7 OK created sql view model public.filter_month_t ......................... [CREATE VIEW in 0.03s]
15:35:15  3 of 7 START sql view model public.unfilter_months_t ........................... [RUN]
15:35:15  3 of 7 OK created sql view model public.unfilter_months_t ...................... [CREATE VIEW in 0.03s]
15:35:15  4 of 7 START sql truncate_insert model public.airport .......................... [RUN]
15:35:15  4 of 7 OK created sql truncate_insert model public.airport ..................... [INSERT 0 0 in 0.02s]
15:35:15  5 of 7 START sql view model public.average_ride_per_month ...................... [RUN]
15:35:15  5 of 7 OK created sql view model public.average_ride_per_month ................. [CREATE VIEW in 0.02s]
15:35:15  6 of 7 START sql truncate_insert model public.airport_morning .................. [RUN]
15:35:15  6 of 7 OK created sql truncate_insert model public.airport_morning ............. [INSERT 0 0 in 0.02s]
15:35:15  7 of 7 START sql truncate_insert model public.airport_unfiltered ............... [RUN]
15:35:15  7 of 7 OK created sql truncate_insert model public.airport_unfiltered .......... [INSERT 0 0 in 0.02s]
15:35:15  
15:35:15  Finished running 4 view models, 3 truncate_insert models in 0 hours 0 minutes and 0.27 seconds (0.27s).
15:35:15  
15:35:15  Completed successfully
15:35:15  
15:35:15  Done. PASS=7 WARN=0 ERROR=0 SKIP=0 TOTAL=7

References

mroy · April 24, 2024, 5:26pm

Why not just use a pre-hook to truncate the table?

Like this:

{{ config(
    pre_hook="TRUNCATE TABLE " ~ this,
) }}

And if you are using Snowflake, something that is even easier is to use INSERT OVERWRITE (which truncates & inserts in one single command).

soliev · April 24, 2024, 5:40pm

Would it overwrite materialisation strategy such as CREATE TABLE?

mroy · April 24, 2024, 5:54pm

No, but you could use the incremental materialization to avoid re-creating the table.

It would “incrementally” insert into an empty table.

Topic		Replies	Views
Custom materialization does not create an object without commit Help	0	501	February 26, 2024
Custom materialisation Help materialization , dbt-core	5	644	July 30, 2024
Create a model to truncate and load into DBT models Help bigquery	2	4258	May 4, 2024
Custom materialization that will decide is it table or view based on model name Help jinja , snowflake , macros , dbt-core	0	257	September 17, 2024
Custom materialization - Run DDL on snowflake -- Error Help	2	1336	January 25, 2023