Issue passing a ref table name to a custom materialization

I’m building a custom materialization, and one of the parameters it needs is the name and schema of a separate table (different from the target one).

I’m calling the materialization in this way:

{{
  config(
    materialized = 'unload_by_partition',
    loop_table = ref('OUT_AREAS_LIST').identifier,
    loop_schema = ref('OUT_AREAS_LIST').schema
  )
}}

and in the materialization I’m retrieving the parameter values in this way:

{% set loop_table = config.get('loop_table') %}
{% set loop_schema = config.get('loop_schema') %}

If I pass the loop_table parameter as a string, everything works perfectly.
When I use the ref().identifier syntax (which I need in order to keep the dependency between the two models), the materialization gets the wrong parameter value.
I logged the parameter value right after the config.get call, and what I see in the log is the target table name, as if the parameter were pointing to {{ this }}.

I don’t really understand what’s happening…

Hey @daniele.frigo - this is a super funky issue and it’s definitely confusing! The thing happening here is that dbt runs in two different modes:

  • parsing
  • running

During parsing, dbt is going to try to find all of the ref(), source(), and config() function calls in your model code in order to determine the shape of the graph and the configurations for your models.

The big problem is that at parse time, dbt has not necessarily walked your entire project to find models yet! This means that when you have some code like:

ref('OUT_AREAS_LIST')

dbt is going to have no idea:

  1. if this OUT_AREAS_LIST model exists
  2. what the configured identifier/schema/database is for the model
  3. what the materialization is (e.g. is it a table/view, or is it ephemeral?)

As such, dbt is just going to return a junk/placeholder Relation object for refs called at parse-time (basically just {{ this }}, which is why you see the target table name in your log). The big issue is that dbt captures the values provided to config() at parse-time too, so these junk values are saved as configs and then used at run-time.
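To make the two phases concrete, here is a toy Python model of the behaviour described above. It is not dbt's actual code: `PROJECT`, `make_ref`, `model_config`, the model name `my_unload_model`, and the schema names are all made up for illustration. It only assumes what is stated in this thread, namely that a parse-time ref() returns a placeholder equivalent to {{ this }} and that config() values are stored at parse time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    schema: str
    identifier: str


# Hypothetical project: model name -> the relation it builds.
PROJECT = {
    "OUT_AREAS_LIST": Relation("staging", "OUT_AREAS_LIST"),
    "my_unload_model": Relation("exports", "my_unload_model"),
}


def make_ref(current_model: str, parsing: bool):
    """Return a ref() function for the given phase."""
    def ref(name: str) -> Relation:
        if parsing:
            # Other models may not be known yet, so hand back a
            # placeholder: the relation of the model being parsed.
            return PROJECT[current_model]
        # At run time the whole project is known.
        return PROJECT[name]
    return ref


def model_config(ref) -> dict:
    """Stands in for the config() call at the top of the model file."""
    return {
        "loop_table": ref("OUT_AREAS_LIST").identifier,
        "loop_schema": ref("OUT_AREAS_LIST").schema,
    }


# Parse time: config() is evaluated once and its values are stored.
stored_config = model_config(make_ref("my_unload_model", parsing=True))

# Run time: the materialization only reads the stored values.
print(stored_config["loop_table"])  # my_unload_model
```

In this sketch the stored `loop_table` is the name of the model being built, not `OUT_AREAS_LIST`, which matches what shows up in the log.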

dbt doesn’t really have a good way to represent the thing you’re trying to do today, but it is a great idea, and I’d very much like to support it! I think we might want to re-process the config() call at model runtime, capturing the true config values and then passing them on to the materialization.

What do you think about something like that?


I don’t have such a deep understanding of dbt’s internal mechanics, but given your explanation this seems like a good idea.
Is it the same approach you use to process the ref() calls included in the main SQL of each model?

@drew, did you have a chance to look at this?
Do you think it could be something you’ll handle in a future release of dbt?
Thanks
Daniele

I sure do! It’s not totally clear what the right path forwards is here, but ideally, your code should work exactly as-is.

I think there are two approaches we could employ here:

  1. Re-capture the config() call at runtime and overwrite the (incorrect) configuration that was determined at parse-time with the correct configuration
  2. Make ref() more special at parse-time. We could capture that you’ve used a ref() in a config, then do some post-processing to “fix” that config once we’ve parsed the entire project

Of these two options, I think the first one is probably easier to implement and generally aligned with how we think about parsing/running models. This is something I’d definitely like to prioritize in a future release of dbt!
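As a rough illustration of the first option, the toy Python sketch below evaluates the config a second time at run time and overwrites the parse-time values. Again, this is not dbt code and not a description of how dbt implements anything: `PROJECT`, `parse_ref`, `runtime_ref`, `model_config`, `run_model`, and the model and schema names are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical project: model name -> (schema, identifier).
PROJECT: Dict[str, Tuple[str, str]] = {
    "OUT_AREAS_LIST": ("staging", "OUT_AREAS_LIST"),
    "my_unload_model": ("exports", "my_unload_model"),
}

Ref = Callable[[str], Tuple[str, str]]


def parse_ref(current_model: str) -> Ref:
    # Parse time: every ref() resolves to the model being parsed.
    return lambda name: PROJECT[current_model]


def runtime_ref() -> Ref:
    # Run time: the whole project is known, so ref() resolves properly.
    return lambda name: PROJECT[name]


def model_config(ref: Ref) -> Dict[str, str]:
    """Stands in for the config() call at the top of the model."""
    schema, identifier = ref("OUT_AREAS_LIST")
    return {"loop_schema": schema, "loop_table": identifier}


def run_model(current_model: str) -> Dict[str, str]:
    # Parse phase: capture config with placeholder refs.
    config = model_config(parse_ref(current_model))
    # Run phase (option 1): evaluate config() again and overwrite
    # the parse-time values before the materialization reads them.
    config.update(model_config(runtime_ref()))
    return config


final_config = run_model("my_unload_model")
print(final_config)
```

With the re-capture step in place, the config handed to the materialization in this sketch points at `OUT_AREAS_LIST` in its own schema rather than at the model being built.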


Hi drew, did you have a chance to look at this issue?
Do you need me to open an issue on GitHub?
Daniele