help with split_part macro

Hi All

I am using the split_part macro in dbt with the dbt-glue adapter, and I am getting the error pasted at the end of this post.

I know the problem is with escape characters, and I know the fix. However, I am not sure where to make the fix. The split_part macro in https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/include/global_project/macros/utils/split_part.sql does not include the call to split() that is mentioned in the error, so I am not sure where to find the exact macro definition that is being used.

Error:

    split(
        thread_id,

        -- escape if starts with a special character
        case when regexp_extract('-', '([^A-Za-z0-9])(.*)', 1) != '_'
        -------------^^^
            then concat('', '-')
            else '-' end

    )[(1)]

What you have linked to is the default implementation of split_part(), which will be used in the absence of an overriding implementation (either from the adapter or from the project). I don't see an overriding implementation in the dbt-glue code.

Note: @Owen originally posted this reply in Slack. It might not have transferred perfectly.

How are you invoking {{ dbt.split_part() }}? Does the compiled SQL script of your model contain valid syntax for your Glue database platform?

If you want to override the behavior of the split_part() macro, you just need to create a glue__split_part() macro in your project (and maybe update your dispatch configuration); a sketch of this is included below.

Note: @Owen originally posted this reply in Slack. It might not have transferred perfectly.
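
For illustration, a minimal project-level override could look something like the sketch below. The file location and the escaping behaviour are assumptions to adapt; the point is only that a macro named glue__split_part() in your own project wins the dispatch.

    {#- e.g. macros/glue__split_part.sql in your own project (the file name is
        just a convention). The escaping here is illustrative only: adjust it
        to whatever behaviour you actually need. -#}
    {% macro glue__split_part(string_text, delimiter_text, part_number) %}

        split(
            {{ string_text }},
            -- assumption: the caller passes a delimiter that is already
            -- escaped for Spark's regex-based split()
            {{ delimiter_text }}
            )[({{ part_number - 1 }})]

    {% endmacro %}

With that file in place, {{ dbt.split_part(...) }} should resolve to your macro ahead of the adapter-provided one (unless your dispatch configuration says otherwise).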

Thanks for your replies.
From the error message, I can see that the default implementation of split_part is calling split(). I want to override the way the escaping happens in my glue__split_part macro, but to do that I need the default implementation of split_part as a starting point. The one at https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/include/global_project/macros/utils/split_part.sql doesn't contain the call to split(), hence my confusion.

It seems that Glue extends the dbt-spark adapter, so you are getting the Spark-specific implementation of split_part: https://github.com/dbt-labs/dbt-spark/blob/main/dbt/include/spark/macros/utils/split_part.sql
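
For reference, the Spark flavour that produced the compiled SQL in your error has roughly this shape (a simplified paraphrase; check the linked file for the exact, current definition):

    {#- Simplified paraphrase of the Spark implementation linked above;
        treat this as a sketch rather than the exact macro. -#}
    {% macro spark__split_part(string_text, delimiter_text, part_number) %}

        {% set delimiter_expr %}
            -- escape if starts with a special character
            case when regexp_extract({{ delimiter_text }}, '([^A-Za-z0-9])(.*)', 1) != '_'
                then concat('\\', {{ delimiter_text }})
                else {{ delimiter_text }} end
        {% endset %}

        split(
            {{ string_text }},
            {{ delimiter_expr }}
            )[({{ part_number - 1 }})]

    {% endmacro %}

Note that where the macro has concat('\\', …), your compiled output shows concat('', '-'), which lines up with the escape-character problem you described.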

As Owen said, you should create a macro called glue__split_part() in your project to have it picked up in favour of the spark version.

Hi Joel

This helps immensely.
So is it the configuration below that dictates that Glue will use the Spark adapter's macro implementation when one does not exist for Glue?

    install_requires=[
        "dbt-core~={}".format(dbt_version),
        "dbt-spark~={}".format(dbt_spark_version),
        "waiter",
        "boto3 >= 1.28.16"
    ],

Thanks again!

Regards,
Subu


Yes - I’m not sure if that’s exactly the part that actually opts in to multiple dispatch, but it’s a sign that it does happen.
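
For what it's worth, the Python dependency just makes dbt-spark available; the fallback itself comes from dbt's macro dispatch. The dbt-core file linked at the top of the thread is essentially a thin dispatch wrapper (paraphrased below), and because the Glue adapter builds on the Spark adapter, adapter.dispatch should look for glue__split_part, then spark__split_part, then default__split_part, and use the first implementation it finds.

    {#- Paraphrase of the wrapper in the dbt-core split_part.sql linked above. -#}
    {% macro split_part(string_text, delimiter_text, part_number) %}
        {{ return(adapter.dispatch('split_part', 'dbt')(string_text, delimiter_text, part_number)) }}
    {% endmacro %}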