Just wanted to note an issue I've run into recently: in some of my user-defined functions (UDFs), I've used ref() to indicate a dependency, but the model calling the function doesn't seem to recognize the ref() inside it.
As a result, whenever I create the schema from scratch for an environment, or whenever anyone sets up their environment for the first time, we run into errors if dbt doesn't run the models in the correct order.
Does the dependency mapping that dbt generates factor in ref() calls within user-defined functions used by models?
As a workaround for now, I've created a macro that emits a CTE which selects true from the model dependency with limit 1, so that the ref is inserted into the model that needs it.
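For illustration, a macro along those lines might look like the sketch below. This is not the actual macro from my project: the macro name `depends_on`, the CTE name, and the model name `some_other_model` are hypothetical placeholders. It assumes dbt picks up the ref() while rendering the model's Jinja at parse time.

```sql
{# macros/depends_on.sql (hypothetical file and macro name) #}
{# Emits a throwaway CTE that selects from the referenced model, #}
{# so the ref() ends up inside the calling model's own code. #}
{% macro depends_on(model_name) %}
dependency_{{ model_name }} as (
    select true as dependency_exists
    from {{ ref(model_name) }}
    limit 1
)
{% endmacro %}

{# models/my_model.sql (hypothetical model that calls the UDF) #}
with {{ depends_on('some_other_model') }}
select my_udf() as udf_result
```

The CTE is never selected from in the final query; its only job is to carry the ref() into the model.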
To put it a different way, if you ref models in something that runs during on-run-start, should those models not be processed first once model processing happens?
Hey @chanwd - can you share how your UDF is built in your dbt project? Are you indeed using an on-run-start hook? Or something else?
If there’s a way for dbt to understand the edge in the DAG, then I think we could make this work. I’d need to see what your code looks like to say for sure.
More generally, I’d advise “hoisting” the ref into your model code. It sounds like you’re doing that with a CTE, but you could also just do it with a SQL comment, eg:
-- models/my_model.sql
/* depends on: {{ ref('some_other_model') }} */
select
my_udf()
from ....
Alternatively, you could get clever and make a macro that:
1. adds a ref in a comment to build the DAG correctly, and
2. returns the UDF code
{% macro call_my_udf(args) %}
/* depends on: {{ ref('my_model') }} */
my_udf({{ args }})
{% endmacro %}
This might make your compiled code look a little funky, but I did want to propose it as a suggestion.
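As a rough sketch of how a model might call such a macro: the column name `order_id` and the model name `orders` below are made up for illustration, and this assumes `call_my_udf` takes its arguments as a single string.

```sql
{# models/my_model.sql (hypothetical model) #}
{# The macro call expands to the "depends on" comment plus the UDF call, #}
{# so the ref() inside the macro is rendered as part of this model. #}
select
    {{ call_my_udf('order_id') }} as udf_result
from {{ ref('orders') }}
```

In the compiled SQL, the "depends on" comment should appear inline, right before the call to my_udf, which is the funky part.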