TL;DR - get ready for any SQL business logic to fail - in looker, DBT, etc. I highly recommend doing the migration over the weekend, give yourself a full day or two of work to fix up differences.
I spent this past weekend migrating from postgres to Snowflake. Tristan has already written a post on on redshift to snowflake, but I thought I’d briefly jot down my notes if you are considering a postgres->snowflake move
- schema search path: I set the default schema in my DBT pre-run-hook, no can do in snowflake, so I used regex to append schemas to all mentioned tables (I know, I know, use sources).
- typecasting pain: I did a lot of
SUM(has_discounts) as n_discounts, and am having difficulty casting booleans to a number
- datetime pain - I had variables like
NOW() - MAX(table.created_at) AS time_since_update, and this magically showed up in Looker as “5 hours, 26 minutes…“. Snowflake, I need to
DATEDIFFand have to choose to look at minutes or hours or days.
- casing - snowflake has different casing preferences, that work very poorly with looker, which by default quotes all your column names. I removed all of the quotes in my LookML to fix this.
From Dan Hunt (DBT group)
I’ve also just done a Postgres -> Snowflake migration. The main points not mentioned that come to mind are related to Snowflake regular expressions:
Regex matching: the Postgres shorthand
column ~ 'regex'becomes
Regex captures: extracting based on a regular expression capture changes from
substring(column from 'regex')to
regex_substring(column, 'regex', 1, 1, 'ie')to get the first match in a string (or similar, depending on case sensitivity, capture group, etc.)
Regex escaping: in regex patterns, you need to double backslash to escape, so
\\.A few functions in Snowflake and not in Postgres have also been pretty helpful:
ifnull(a, b): I prefer this function for readability to the Postgres equivalent,
iif(condition, a, b): perhaps not everyone agrees here, but I prefer this to a
casestatement with only one condition for cleaner code.
- I haven’t used
try_castyet, but this is also missing an equivalent from Postgres and potentially a big one for cleaning up conversions near the source layer
From gitlab - https://gitlab.com/gitlab-data/analytics/issues/645
now()was replaced with
- UDFs had to be re-written
- date math is totally different; we can now leverage the native
dateaddto see the difference between two dates or a date and some interval (
dateaddcan also subtract by use of negative intervals)
- snowflake errors out on
currentis a protected word