Why Unique Keys?
There are about 1 billion reasons not to use system natural keys ( ie the source record identifiers like
account_number ) in a production data warehouse. The TLDR version is that your warehouse is where the buck stops - as the warehouse engineer you are responsible for the validity of unique identifiers, so it only makes sense to take full ownership of them.
So how do we generate unique keys?
Snowflake supports postgres-like sequences. A sequence is an RDBMS object that maintains a series of always-incrementing integers - so every time you ask a sequence for a number, you get one that is guaranteed to:
a. have never been given out before
b. be higher than any number given out before
You can create sequences just like in postgres:
CREATE OR REPLACE SEQUENCE your_sequence START = 1000
I like to start mine with a higher number just to make the resulting ID’s easier to read.
Then when calling the sequence, simply join it in and get the resulting nextval.
SELECT *, seq.nextval AS totally_unique_key FROM public.hamburgers_table, table(getnextval(your_sequence)) seq
I’ll write more about surrogate keys vs ids, vs natural keys later - but for now this is how you can populate (and own) the uniqueness of your data instances.