Converting legacy sql build scripts into models

ericpp · August 3, 2021, 10:13pm

Hello,

I’m currently working to convert legacy sql build scripts into dbt models. Most of the scripts start out with a simple CREATE TABLE xxx AS SELECT <columns> FROM table statement and then run subsequent UPDATE statements to make changes to the values of the table.

I’m currently handling this in dbt by packaging each update into its own CTE, e.g.:

WITH source AS (
  SELECT * FROM ref('source')
),
fix_col1_value AS (
  SELECT
    {{ dbt_utils.star(ref('source'), except=["col1"]) }},
    CASE
      WHEN othercol = 'val' THEN 'newval'
      ELSE col1
    END AS col1
  FROM
    source
),
fix_col2_value AS (
  SELECT
    {{ dbt_utils.star(ref('source'), except=["col2"]) }},
    CASE
      WHEN someothercol = 'val' THEN 'newval2'
      ELSE col2
    END AS col2
  FROM
    fix_col1_value
),
...
final AS (
  SELECT * FROM fix_colX_value
)
SELECT * FROM final

Is this a good approach for handling something like this? I do like that the individual updates from the old script are still separated out as CTEs rather than being jumbled together into one SELECT statement.

The code itself feels a bit long with a lot of repeated {{ dbt_utils.star() }} calls. It also feels potentially error prone if someone adds a new CTE and forgets to update the chain of CTEs.

Thanks,
Eric

Topic		Replies	Views
Modeling SQL update statements Archive	7	15119	October 2, 2021
Is DBT right choice for SQL pipeline? Archive	1	2178	September 9, 2021
🚀 Introducing `cte2dbt`: effortlessly convert SQL CTE into dbt models Show and Tell dbt-utils	0	313	February 22, 2025
DBT model that contains both SQL and Python Show and Tell testing , snowflake , dbt-core	0	257	February 4, 2025
Modeling existing multistep ETL process Help best-practice , modeling , dbt-core	0	581	February 6, 2024

Converting legacy sql build scripts into models

Related topics