Using dbt to manage user defined functions

hvignolo · February 2, 2024, 1:59pm

Hey! Hope you’re doing well.

I just wanted to add something that I think might be helpful to this excellent original idea.

The catch is that the on-run-start hook re-applies all the UDFs on every invocation, which adds a few seconds to each model/test run. In my case, those seconds added up to around 10-15% of the entire dbt pipeline runtime, which felt like too much for something that is essentially static.

To tackle this, I kept the same idea described here, but instead of running it from the hook, it runs as a step in the CI pipeline, and only when a file matching the glob pattern macros/udfs/**/*.sql has been modified. You may need to make some adjustments to your infrastructure (your runners), but it was worth it in my case.
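For anyone wanting to try this, here is a rough sketch of what such a CI step could look like in Python. It is not my exact setup: the macro name create_udfs, the base ref origin/main, and the helper names are all placeholders, and the path check is only an approximation of the glob macros/udfs/**/*.sql (any .sql file at any depth under macros/udfs/).

```python
import subprocess
from pathlib import PurePosixPath

# Directory that holds the UDF macros (taken from the glob macros/udfs/**/*.sql).
UDF_DIR = PurePosixPath("macros/udfs")


def touches_udfs(changed_files):
    """Return True if any changed path is a .sql file somewhere under macros/udfs/."""
    for name in changed_files:
        path = PurePosixPath(name)
        if path.suffix == ".sql" and UDF_DIR in path.parents:
            return True
    return False


def changed_files_between(base_ref, head_ref="HEAD"):
    """List files changed between two git refs (paths relative to the repo root)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...{head_ref}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def main(base_ref="origin/main"):
    # base_ref is an assumption; use whatever your CI compares against.
    if touches_udfs(changed_files_between(base_ref)):
        # "create_udfs" is a hypothetical macro name; substitute the macro
        # that your on-run-start hook used to call.
        subprocess.run(["dbt", "run-operation", "create_udfs"], check=True)
    else:
        print("No UDF changes detected, skipping UDF deployment.")
```

Calling main() from the CI step, before the regular dbt run, should be enough. Many CI systems can also filter jobs by changed paths natively, in which case only the dbt run-operation call is needed.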

Thanks!

brabster · February 19, 2024, 10:38am

Hey, I’ve written an article on UDFs as a custom dbt materialization, which lets you incorporate them into the DAG and so elegantly solves most (all?) of the problems I’ve seen mentioned here.

Writeup here: Materialized UDFs in a dbt World - Tempered Works

Example here: pypi_vulnerabilities/models/published/udfs/matches_multi_spec.sql at main · brabster/pypi_vulnerabilities · GitHub
