refactor code and get the same TFV function as the generated code with variable passing

The problem I’m having

I am currently utilizing an on-the-fly data transformation that is based on a webrequest_id ID. To put it simply, I’m employing a Table-Valued Function (TVF), which looks something like this:

select * from get_rank_for_web_request(webrequest_id);

The body of the function comprises over 200 distinct transformation logic operations. These are primarily responsible for performing ranking calculations. The function appears somewhat as follows:

create or replace function get_rank_for_web_request(webrequest_id string)  
RETURNS TABLE(
 webrequest_id string,
 rank int,
 rankonrock int,
 scanbackrank decimal(12, 4)
) return

with get_request as (

select * from request_table where request_id = webrequest_id

)
get_rank_for_reuest (

select * from rank_table where request_id = webrequest_id
)
additional 200 cte


,

select  * from rank;

now looking to refactor this code using dbt , with each (CTE) serving as a ephemeral model. However, I’m unsure how to dynamically filter the CTEs, how to create the function, and how to effectively reproduce the same functionality as the Table-Valued Function (TVF).

Go with ephemeral models if you have a reusable logic.

If this logic is only specific to this model then i advice you write the logic in a single model and use var function to dynamically filter in ctes


with get_request as (

select * from request_table where request_id = {{var("webrequest_id"}}

)
get_rank_for_reuest (

select * from rank_table where request_id = {{var("webrequest_id"}}
)
additional 200 cte


,

select  * from rank;

Use --vars argument with dbt run command to pass webrequest_id

dbt run model_name --vars 'webrequest_id: 1234566'

1 Like

Variables are not helping here. The end goal here is to create the function. the function is the one that will get queried against

You can try creating a macro to replace the function and use it inside a CTE, then query against the CTE

something like

-- my macro file
{% macro get_rank_for_web_request(webrequest_id) %}
    
    with get_request as (
    
    select * from request_table where request_id = {{ webrequest_id }}
    
    )
    get_rank_for_reuest (
    
    select * from rank_table where request_id = {{ webrequest_id }}
    )
    additional 200 cte
    
    
    ,
    
    final as (select  * from rank)
{% endmacro %}

Then build your model like

-- my model file

{{ get_rank_for_web_request(webrequest_id) }}

select * from final
1 Like

Yes but that’s like brining the whole code into the UDF is there any way that i can refere the model from macro? like

{% macro get_rank_for_web_request(webrequest_id) %}
    
select * from ref();
{% endmacro %}

Yes, you can

{% macro get_rank_for_web_request(webrequest_id) %}
 
...
select * from {{ ref('my_model') }}

{% endmacro %}

You can also pass the model as an argument

{% macro get_rank_for_web_request(webrequest_id, my_model) %}
    
select * from {{ ref(my_model) }}

{% endmacro %}

:zap: If you are creating ephemeral models for ctes, then u can’t refer them in macros using {{ ref() }} function.

If the macros called via dbt run-operation cannot ref() ephemeral nodes

1 Like

Hi Surya, you mean macros called via dbt run-operation, right? Materializations | dbt Developer Hub

Or any macro? (I’ve never tested it)

That’s right it is not possible with CTE.
are there any other options? i just want the generated code for the function based on CTE and ephemeral

macros called via dbt run-operation cannot ref() ephemeral nodes

I didn’t quite understand the problem above, your function can be converted to a macro that will generate the SQL code based on your webrequest_id, then you can query against the final CTE generated by the macro

What you see as the downsides of this solution?

the problem is that the function is deliverable,

create or replace function get_rank_for_web_request(webrequest_id string)  
RETURNS TABLE(
 webrequest_id string,
 rank int,
 rankonrock int,
 scanbackrank decimal(12, 4)
) return

with get_request as (

select * from request_table where request_id = webrequest_id

)
get_rank_for_reuest (

select * from rank_table where request_id = webrequest_id
)
additional 200 cte


,

select  * from rank;

this function will be delivered. so later in another application we can call only the function through JDBC connection like the following:

select * from get_rank_for_web_request('1111')

it is like creating a function which its body is a generated sql code.

1 Like