Formatting single column with large number of updates

I have an address column that needs to be formatted.
There are 20+ updates that would usually take place, an example of which can be found below:

 'COURT':'CT'
 'STREET':'ST'
 'PLACE':'PL'
 'DRIVE':'DR'
 'AVENUE':'AVE'
 'ROAD':'RD'
 'TERRACE':'TER'
 'LANE':'LN'
 'HIGHWAY':'HWY'
 'BOULEVARD':'BLVD'
 'NORTH':'N'
 'EAST':'E'
 'SOUTH':'S'
 'WEST':'W'

I would usually run multiple UPDATE staements such as:

SET [Address] = REPLACE([Address],'Street', 'ST')
WHERE CHARINDEX('Street', [Address]) > 0

How would I perform these updates in a ‘dbt way’?

CASE WHEN has been suggested in the documentation: UPDATES

I don’t believe CASE WHEN is suitable as a single row may require multiple updates.

I could use a ridiculously long REPLACE statements :

REPALCE(
	REPLACE(
		REPLACE(
			REPLACE(Address,'COURT','CT'),
		'STREET','ST'),
	'PLACE','PL'),
'DRIVE','DR'),.......

Are there any alternatives to the long REPLACE statement, am I missing something?

You could make a macro that holds the dictionary of replacements and then jinja-loops to build a gnarly replace() statement.

In general, though, be careful when doing naïve replacements on address data. You probably wouldn’t want to translate “Midriver, Georgia” into “MiDRr, Georgia” :wink:

Note: @Owen originally posted this reply in Slack. It might not have transferred perfectly.

1 Like