Saving a compiled macro to be run afterwards to avoid nested Jinja

The problem I’m having

I’m trying to create a macro that builds one CTE for each seed file it finds with a specific pattern in its name, while using the ref() function so that lineage is preserved. If I try to do this in one step, the result is nested Jinja and the ref is never translated into SQL.

The context of why I’m trying to do this

We are in the middle of migrating from SQL Server to Snowflake, and there is a report that needs updating but doesn’t have a source table in the raw DB yet. Every update arrives as multiple CSV files that we load as seeds, and each time multiple CTEs need to be added to the staging model to account for the new seeds.

What I’ve already tried

I have a script that compiles perfectly, but since the compiled output still contains nested Jinja, it doesn’t run successfully.
The problem comes from `FROM {{ begin }}"{{ seed }}"{{ end }}` compiling into `{{ ref("foo1") }}`, `{{ ref("foo2") }}`, etc. for each CTE, and dbt never resolves those refs in a second pass.
Keeping lineage is a mandatory requirement at my work, so unfortunately I cannot use a plain "db.schema.table" reference, which would otherwise solve my problem.

I’m wondering if there is a way to save the compiled version of my macro as a model during a run, so that it runs separately afterwards and gets another round of compilation, working around the nested-Jinja problem.

Some example code or error messages

{#- collect every seed whose name contains the pattern -#}
{%- set seeds = [] -%}
{%- for node in graph.nodes.values()
    | selectattr("resource_type", "in", ["seed"]) -%}
    {%- if 'foo' in node.name -%}
        {%- do seeds.append(node.name) -%}
    {%- endif -%}
{%- endfor -%}

{#- literal fragments, so the rendered SQL contains "{{ ref(...) }}" as text -#}
{%- set begin = "{{ ref(" -%}
{%- set end = ") }}" -%}

{#- emit one CTE per matching seed -#}
{%- for seed in seeds -%}
    {%- if loop.first %}
    WITH
    {%- endif %}
    {{ seed }} AS (
        SELECT
            bar1,
            bar2,...
        FROM {{ begin }}"{{ seed }}"{{ end }}
    ),
{% endfor %}
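For comparison, the nested Jinja goes away entirely if ref() is called directly inside the loop instead of being assembled as literal text. A minimal sketch of that variant follows; the final query combining the CTEs is hypothetical, and since the graph variable is only fully populated at execution time (not during parsing), it is worth verifying that dbt still captures these refs for lineage:

{%- set seeds = [] -%}
{%- for node in graph.nodes.values()
    | selectattr("resource_type", "in", ["seed"]) -%}
    {%- if 'foo' in node.name -%}
        {%- do seeds.append(node.name) -%}
    {%- endif -%}
{%- endfor -%}

{%- for seed in seeds -%}
    {%- if loop.first %}
    WITH
    {%- endif %}
    {{ seed }} AS (
        SELECT
            bar1,
            bar2
        FROM {{ ref(seed) }}  {# resolved in this same compile pass #}
    ){{ "," if not loop.last else "" }}
{%- endfor %}

{# hypothetical final query stitching the CTEs together #}
{%- for seed in seeds %}
SELECT bar1, bar2 FROM {{ seed }}
{{ "UNION ALL" if not loop.last else "" }}
{%- endfor %}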

Seeds really aren’t intended for loading source data into the DW:
https://docs.getdbt.com/docs/build/seeds#overview
> Poor use-cases of dbt seeds:
> • Loading raw data that has been exported to CSVs
Given this:
> We are in the middle of migrating from SQL Server to Snowflake, and there is a report that needs updating but doesn’t have a source table in the raw DB yet. Every update arrives as multiple CSV files that we load as seeds, and each time multiple CTEs need to be added to the staging model to account for the new seeds.
I understood that you have the source data available as CSV files. Look into using Snowflake COPY INTO:
https://docs.snowflake.com/en/sql-reference/sql/copy-into-table
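
A minimal sketch of what that could look like, assuming the CSV files have been uploaded to a named stage (the table, stage, and file format here are all hypothetical):

-- hypothetical target table and stage
COPY INTO raw_db.staging.foo_updates
FROM @csv_stage/foo/
PATTERN = '.*foo.*[.]csv'
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);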

Or, since you have used seeds, I’m guessing the data volume isn’t too large. You could consider external tables: https://docs.snowflake.com/en/sql-reference/sql/create-external-table
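
A rough sketch for CSVs in an external stage (names are hypothetical; for CSV files Snowflake exposes the parsed columns through the VALUE variant as c1, c2, and so on):

-- hypothetical external table over staged CSV files
CREATE OR REPLACE EXTERNAL TABLE raw_db.staging.foo_ext (
    bar1 VARCHAR AS (value:c1::VARCHAR),
    bar2 VARCHAR AS (value:c2::VARCHAR)
)
LOCATION = @csv_stage/foo/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
AUTO_REFRESH = FALSE;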

Or automating the ingestion using Snowpipe:
https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro
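
A pipe essentially wraps the same COPY INTO statement so it runs whenever new files land in the stage; again, every object name here is hypothetical:

-- hypothetical pipe; AUTO_INGEST needs an external stage with event notifications configured
CREATE OR REPLACE PIPE raw_db.staging.foo_pipe
    AUTO_INGEST = TRUE
AS
COPY INTO raw_db.staging.foo_updates
FROM @csv_stage/foo/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);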

Snowpipe and external tables can be managed with dbt_external_tables:
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
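
With that package the external table is declared as a dbt source in YAML, roughly like this (all names here are hypothetical), and then created by running dbt run-operation stage_external_sources:

sources:
  - name: raw_foo
    database: raw_db
    schema: staging
    tables:
      - name: foo_ext
        external:
          location: "@csv_stage/foo/"
          file_format: "(TYPE = 'CSV' SKIP_HEADER = 1)"
        columns:
          - name: bar1
            data_type: varchar
          - name: bar2
            data_type: varchar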

Note: @Mikko Sulonen originally posted this reply in Slack. It might not have transferred perfectly.