The exact privileges we grant to set up Redshift

:sparkles: 2021 update :sparkles:
This article originally appeared as part of an article that covered grant statements for both Redshift and Snowflake (Snowflake version here). The original was written in 2019, when I was an employee of dbt Labs. Now, I’m freelancing (among other things), and actually got the opportunity to set up some new Snowflake databases recently. I took the opportunity to update some of my advice over on the Snowflake version of this article, and figured it was more logical to have two separate articles. Note that I haven’t used Redshift lately, but I think this advice still stands (until they implement cross-database queries for everyone, not just those using ra3 nodes).

Before reading through these exact instructions, take a moment to read some prior art:

  1. @jerco’s article on setting up the concepts behind sending up Snowflake for clients — we’re trying to apply a lot of the same ideas on Redshift, but with less flexibility (namely, no databases or warehouses)
  2. My article on database administration

While these article go deep into the “what” and “why” of database admin respectively, it was pretty light on the very specific “how” of setting this up, i.e. the implementation steps.

In this guide, I’ll run through the exact privileges it takes to set up your Redshift cluster the way these articles intended.

1. Set up the database

On Redshift, you can’t run cross-database queries, so we’ll do everything in one single database. By default, Redshift creates a database named dev, so let’s create our own one with a better name

create database analytics

2. Set up groups

We’ll grant privileges to these groups rather than to users directly

create group loader;
create group transformer;
create group reporter;

3. Create users, assigning them to their groups

Every human being and application gets a separate user, and is assigned to the correct shared role.

create user stitch -- or fivetran
    password = '_generate_this_'
    in group loader;

create user claire -- or amy, jeremy, etc
    password = '_generate_this_'
    in group transformer; 

create user dbt_cloud
    password = '_generate_this_'
    in group transformer;

create user looker -- or mode or periscope
    password = '_generate_this_'
    in group reporter;

4. Let loader load data

By granting a user the privilege to create a schema within a database, they will also then be able to create tables and views within that schema.

grant create on database analytics to group loader;
grant select on all tables in schema information_schema to group transformer;
grant select on all tables in schema pg_catalog to group transformer;

^ I’ve also snuck in some statements to give privileges to information schemas here – these are required by both your ETL tools (i.e. members of your loader group) and by dbt users (i.e. members of your transformer group).

5. Wait…

If you’ve just set up your database from scratch, you’ll have to wait a while here for data to start flowing before you can grant privileges for members of your transformer group to read it.

6. Let transformer transform data

Once your raw data is in your warehouse, you’ll need to give privileges for your transformer group to read it!

There are three parts to granting the privilege to read all relations within a schema:

  • Granting usage on a schema
  • Granting select privileges on all existing relations within a schema
  • Granting select privileges on all relations created in this schema in the future (default privileges), so that you don’t need to rerun grant statements for every new object in the schema.

On Redshift, this looks like:

-- for each schema:
grant usage on schema raw_stripe to group transformer;
grant select on all tables in schema raw_stripe to group transformer;
-- note that the `for user` clause should be adjusted based on the user who will be creating
-- objects in the schema (which should also be the schema owner).
alter default privileges for user stitch in schema raw_stripe
  grant select on tables to group transformer;

You’ll need to rerun these whenever you add a new schema to your database via Fivetran or Stitch

This group also needs to be able to create in the database:

grant create on database analytics to group transformer;

And needs some permissions to generate dbt docs

grant select on all tables in schema information_schema to group transformer;
grant select on all tables in schema pg_catalog to group transformer;

7. Let reporter read the transformed data

The reporter role should be able to read anything that dbt creates.

In Redshift, we implement these via an on-run-end hook – check out this article about the exact hooks we run!

8. Maintain!

When new users are added, make sure you add them to the right shared role! And as new schemas of raw data are added, ensure that privileges are granted to the transformer role to be able to read them.

2 Likes