Repeated CTEs, DRY and intermediate models good practice

Hi everyone,

I am refactoring some legacy SQL codebase at my company and after reading “How we structure our dbt projects”, something does not click for me regarding intermediate models.

I have a case where 2 marts models have a big portion of repeated code in a CTE. Actually, the person who wrote the code even created a temporary table (we’re using BQ) for this CTE. So this same TEMP table is created 2 times in 2 different models.

To me, the logical thing to do was to move this repeated piece of code in an intermediate model that would be ref’ed by the 2 marts models. However, the chapter of the guide regarding Intermediate Models advocates for a 1 to 1 relationship between intermediate and mart models.

I understand that if I would move my CTE into an “ephemeral” intermediate model (as it is recommended), we would compute the same thing twice. However, according to the guidelines, I don’t see where else this code should live.

Of course guidelines are just guidelines but I’m curious as to what is the dbt way to handle this. Moving it to a mart model even if it’s not really aimed to end users? And what about CTEs that are reused in many different models?



Create a new folder and call it
Put there what is shared on the downstream and use table as materialization
Just avoid cycle between intermediate and final models

I’m in a similar situation as the OP and this does not seem to solve the problem as I understand it: two pipelines share code (e.g. a big CTE), how do you adhere to the DRY principle in this case? Do you have to resort to a macro? That seems a bit unwieldy.