Modeling SQL update statements

:wave: Hey there! Welcome to dbt!

Based on the original post, here are a couple of assumptions I’m making, along with an example approach below.

Assumptions:

  • blah is available to dbt as a source (see the sources.yml sketch just after this list)
  • We’re only interested in two columns (column_name, other_column)
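
To make that first assumption concrete, blah would need to be declared as a dbt source in a YAML file in your models directory. A minimal sketch — the container, schema, and table names here are placeholders I’m assuming so they line up with the source() call below:

sources.yml

version: 2

sources:
  - name: blah_container        # first argument to source()
    schema: blah_container      # assuming the schema shares the container name; adjust for your warehouse
    tables:
      - name: blah_table        # second argument to source()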

With those assumptions, I’m thinking some Jinja is all you’d need to clean up the model.

Example:

This model, stg_foobar, essentially replaces your create table statement by loading everything from blah into foobar. You can schedule dbt to run it as often as you need so that foobar always has the data it needs from blah (assuming something in the code omitted from your post means foobar isn’t just a straight copy of blah).

stg_foobar.sql

select * from {{ source('blah_container', 'blah_table') }}
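
If you’re orchestrating this on a schedule, a command along these lines would rebuild stg_foobar and everything downstream of it (the trailing + is dbt’s graph operator for downstream models):

dbt run --select stg_foobar+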

I picture this next model, categorized_foobar, as a view, but it could just as easily be a table. The point is that you never need to go back and run updates against existing tables; instead, this model builds on top of stg_foobar, so any time data is read from categorized_foobar, it’s automatically using the evaluated columns.

The Jinja mappings variable gives you a single place to cleanly define your mapping rules and saves you from writing out a lengthy case statement by hand.

categorized_foobar.sql

{% set mappings = {'something': 'boo', 'something-else': 'boo-else'} %}

with source as (

    select * from {{ ref('stg_foobar') }}

),

final as (

    select
        other_column,
        case
            -- one when clause is generated per entry in mappings
            {% for old, new in mappings.items() %}
            when other_column like '{{ old }}' then '{{ new }}'
            {% endfor %}
        end as column_name
    from source

)

select * from final
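
Since categorized_foobar could be a view or a table, you can pin that down with a config block at the top of the model file (or in dbt_project.yml). For example, to materialize it as a view:

{{ config(materialized='view') }}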

Of course, if the complexity of the string matching is the crux of the problem here (that doesn’t stand out to me in the original question), we might look at another approach.

P.S. I’m drafting this up without access to run/test it, so you may run into some syntax issues, but I think conceptually it should work.