The problem I’m having
I’m trying to create a dbt Python model that transforms some data and creates a table.
The context of why I’m trying to do this
I defined a Python function in the same .py file as my model and registered it as a UDF inside the model function, but I get the error shown further down when I execute dbt run.
I’m registering it with the following code:
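# F and T refer to the Snowpark functions and types modules
import snowflake.snowpark.functions as F
import snowflake.snowpark.types as T

def register_udf_parse_data():
    # Wrap parse_data in a Snowpark UDF that takes and returns a VARIANT
    parse_data_udf = F.udf(
        lambda column_data: parse_data(column_data),
        return_type=T.VariantType(),
        input_types=[T.VariantType()],
    )
    return parse_data_udf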
My function looks like this:
def parse_data(column_data):
    result = {}
    for some_value in column_data:
        key = some_value["key"]
        value = some_value["value"]
        ....
        ....
    return result
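To make the expected input and output concrete, here is a stripped-down, hypothetical version that runs on a plain Python list. It assumes the column holds a list of key/value objects and that the elided steps simply store each pair in the result; the real function does more, so treat this only as a sketch of its shape:

```python
def parse_data(column_data):
    """Collapse a list of {"key": ..., "value": ...} entries into one dict."""
    result = {}
    for some_value in column_data:
        key = some_value["key"]
        value = some_value["value"]
        # Assumption: the elided steps of the real function are replaced
        # here by a plain assignment, purely for illustration.
        result[key] = value
    return result


# Hypothetical sample resembling one value of the column
sample = [{"key": "a", "value": 1}, {"key": "b", "value": 2}]
parsed = parse_data(sample)
print(parsed)
```

Nothing here depends on Snowpark or dbt, so the function can be checked on its own before it is registered as a UDF.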
In main, after reading my table from the database with dbt.source(), I do the following:
parse_data_registered = register_udf_parse_data()
df = df.withColumn("new_col", parse_data_registered('some_col'))
This is the error I get when I execute dbt run:
ModuleNotFoundError: No module named 'main_module'
in function SNOWPARK_TEMP_FUNCTION_VCZMD70DU6 with handler compute
in function name_of_function_PY__DBT_SP with handler main
What I’ve already tried
I tried changing the materialization to incremental, but that does not work either.
I also tried creating a permanent UDF, but I get the same error message.
Any suggestions or help would be appreciated.