Python model injecting SQL

Welcome everybody!

The problem I’m having

I just started using Python models, and on the very first one I am seeing some weird behaviour. My model is based on a SQL model (which I think is valid according to the documentation). However, when dbt generates the .py file, it injects raw SQL into the code.

Input (model.py)


def model(dbt, session):
    dbt.config(materialized = "table")
    df = dbt.ref("my-sql-model")

    dx = df
    ...

Output (.py file on GCS)

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('smallTest').getOrCreate()

spark.conf.set("viewsEnabled","true")
spark.conf.set("temporaryGcsBucket","my-bucket-name")

with __dbt__cte__dictionary__xxx as (
select
    something1 as column1,
    something2 as column2
from
    `my-project`.`my-dataset`.`my-table`
where
    xxx
)def model(dbt, session):
    dbt.config(materialized = "table")
    import pandas as pd
    df = dbt.ref("my-sql-model")

    dx = df

Setup

BigQuery, GCS, dbt Cloud v1.3

I wasn’t able to do much here, as I really don’t understand where the SQL is coming from. Has anyone faced this issue before?

I spent some more time on it and noticed that the base model being set to ephemeral is what causes this issue. I probably should have caught it much earlier, because the injected SQL (the `__dbt__cte__` prefix) is pretty distinctive.

Still, isn’t this somewhat of a bug? For example, an ephemeral model could still be passed to the Python model as a query, or there could be a validation step that prevents such models from being used in Python models.

You’re correct that your issue is because you’re ref-ing an ephemeral model. I would open an issue on the core repo for this - I suspect this is a bug as opposed to an intentional decision to not support accessing ephemeral models. If it is intentional then it should have a better error message at least!
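Until dbt handles this itself, the validation suggested above could be approximated with a small check over the project's `manifest.json`. This is only a rough sketch, not anything dbt ships: the function name `find_python_refs_to_ephemeral` is made up for illustration, and it assumes the manifest exposes `resource_type`, `language`, `config.materialized`, and `depends_on.nodes` on each node, which may differ between dbt versions.

```python
def find_python_refs_to_ephemeral(manifest: dict) -> list[tuple[str, str]]:
    """Return (python_model_id, ephemeral_model_id) pairs found in a manifest dict.

    Hypothetical helper: the field names below are assumptions about the
    manifest layout and should be checked against your dbt version.
    """
    nodes = manifest.get("nodes", {})

    # Collect the unique IDs of every model materialized as ephemeral.
    ephemeral_ids = {
        uid
        for uid, node in nodes.items()
        if node.get("resource_type") == "model"
        and node.get("config", {}).get("materialized") == "ephemeral"
    }

    offenders = []
    for uid, node in nodes.items():
        # Only Python models are of interest here.
        if node.get("resource_type") != "model" or node.get("language") != "python":
            continue
        # Flag each direct parent that is an ephemeral model.
        for parent in node.get("depends_on", {}).get("nodes", []):
            if parent in ephemeral_ids:
                offenders.append((uid, parent))
    return offenders
```

To use it, load the manifest (typically `target/manifest.json` after `dbt compile`) with `json.load` and pass the resulting dict in; any pairs it returns are Python models that ref an ephemeral model. In the meantime, the practical workaround is to materialize the upstream SQL model as a view or table instead of ephemeral.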
