Creating a UDF in dbt Python models with Snowflake as the database

The problem I’m having

I’m trying to create a Python model that transforms some data and creates a table.

The context of why I’m trying to do this

I defined a Python function in the same .py file as my model and registered it inside the model function in dbt, but I get the error below when I execute dbt run.

I’m registering it with the following code:

def register_udf_parse_data():
    parse_data_udf = F.udf(
        lambda column_data: parse_data(column_data),
        return_type=T.VariantType(),
        input_types=[T.VariantType()],
    )
    return parse_data_udf

My function looks like this:

def parse_data(column_data):
    
    result = {}
    
    for some_value in column_data:
        key = some_value["key"]
        value = some_value["value"]
    ....
    ....

    return result

In main, after reading my table from the database with dbt.source(), I do the following:

    parse_data_registered = register_udf_parse_data()
    
    df = df.withColumn("new_col", parse_data_registered('some_col'))

This is the error I get:

ModuleNotFoundError: No module named 'main_module'
     in function SNOWPARK_TEMP_FUNCTION_VCZMD70DU6 with handler compute
     in function name_of_function_PY__DBT_SP with handler main

What I’ve already tried

I have tried changing the materialization to incremental, but that does not work either.
I have also tried creating a permanent UDF, but I get the same error message.

Any suggestions or help would be appreciated.

Found a solution to this. It turns out the current dbt-python does not support importing a Python function from another module or folder. I had to define the function as a one-liner using a lambda; after that I was able to register the function on Snowflake.
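
A minimal sketch of the workaround described above, assuming the elided body of parse_data only maps each item's "key" to its "value". That mapping, the helper name make_parser, and the source names my_source and my_table are hypothetical and not from the original code. The idea is that the lambda handed to F.udf carries all of its logic inline and does not call a separately defined helper, which appears to be what avoids the ModuleNotFoundError.

```python
def make_parser():
    # Hypothetical helper: returns a lambda whose whole body is inline,
    # so it does not reference any other function in this file.
    # The key -> value mapping is an assumption; the original parse_data
    # body is elided in the post.
    return lambda column_data: {
        item["key"]: item["value"] for item in column_data
    }


def model(dbt, session):
    # Imported here rather than at module level so the parsing logic above
    # can be checked locally without Snowpark installed.
    from snowflake.snowpark import functions as F
    from snowflake.snowpark import types as T

    dbt.config(materialized="table")

    # "my_source" and "my_table" are placeholder names.
    df = dbt.source("my_source", "my_table")

    # Register the self-contained lambda as a UDF, as in the original post.
    parse_data_udf = F.udf(
        make_parser(),
        return_type=T.VariantType(),
        input_types=[T.VariantType()],
    )

    return df.withColumn("new_col", parse_data_udf("some_col"))
```

The lambda can be checked on plain Python data before executing dbt run: make_parser()([{"key": "a", "value": 1}]) returns {"a": 1}.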
