Generating Supernatural (Guaranteed Unique) Keys in Snowflake


#1

Why Unique Keys?

There are about 1 billion reasons not to use system natural keys ( ie the source record identifiers like user_id or account_number ) in a production data warehouse. The TLDR version is that your warehouse is where the buck stops - as the warehouse engineer you are responsible for the validity of unique identifiers, so it only makes sense to take full ownership of them.

So how do we generate unique keys?
Snowflake supports postgres-like sequences. A sequence is an RDBMS object that maintains a series of always-incrementing integers - so every time you ask a sequence for a number, you get one that is guaranteed to:
a. have never been given out before
b. be higher than any number given out before

You can create sequences just like in postgres:
CREATE OR REPLACE SEQUENCE your_sequence START = 1000
I like to start mine with a higher number just to make the resulting ID’s easier to read.

Then when calling the sequence, simply join it in and get the resulting nextval.

    SELECT 
        *,
        seq.nextval AS totally_unique_key
    FROM
        public.hamburgers_table, 
        table(getnextval(your_sequence)) seq

I’ll write more about surrogate keys vs ids, vs natural keys later - but for now this is how you can populate (and own) the uniqueness of your data instances.