With dbt v1.2, much of this article is obsolete. Check out our updated permissioning guidelines on the dbt Developer Blog for our modern recommendations. - Joel
In a well maintained warehouse, your BI tools will need to be granted the privilege to read the tables and views dbt creates.
In this article, I’m going to run through the exact statements we run to grant privileges, and how we implement them in dbt, starting with a basic example and making it more flexible (but also more complex!) as we go.
In this article, let’s imagine we have a data warehouse that:
- Is a Redshift warehouse (I’m using Redshift to start with as I think it’s the most common data warehouse used within our community, not because I think it’s the best warehouse – I’ll include the equivalent SQL for Snowflake at the end of the article)
- Has a group named
reporter
, where every human (claire
,drew
,another_analyst_on_your_team
) and app (looker
,periscope
,mode
) that runs reporting queries is a member of thereporter
group. - Follows the convention that every table and view within a schema is created by the same process, has the same owner, and has the same privileges applied to it.
- Your dbt user creates, and therefore owns, the schema it is creating objects within (sometimes this isn’t true in larger organizations with different information security rules).
Using on-run-end hooks
For every schema that dbt creates, you’ll need to grant the following privileges:
- usage on the schema that your relations are in
- select on the tables and views within those schemas – you can either grant these individually to each table or view, or use the
on all
syntax
Let’s start by assuming that you are creating everything in one schema, named analytics
. The grant statements you’d need to run would be:
grant usage on schema analytics to group reporter;
grant select on all tables in schema analytics to group reporter;
You really only need to run that first statement once for each schema – however running it multiple times won’t hurt! (For anyone following along at home, this is an example of idempotence – running it once, or running it six times, will have the same effect). You’ll need to make sure the schema exists before you run it!
That second statement needs to be run at the end of a run; so that it gets applied to all the tables and views dbt has created (many years ago, I assumed that the “grant select on all” would apply to new tables in my warehouse too – evidently not, but we’ll get to that soon).
So, let’s implement these statements as an on-run-end
hook:
# dbt_project.yml
...
on-run-end:
- 'grant usage on schema analytics to group reporter'
- 'grant select on all tables in schema analytics to group reporter'
Great! Kind of…
Assuming that you’re using a different schema when running dbt locally (i.e. your dev
environment) compared to your prod
runs, these statements won’t make any sense for your dev
runs; in fact, they’ll probably fail (your local user won’t own the analytics
schema, and they aren’t a superuser (right?!) so they can’t grant privileges on a schema they don’t own).
Let’s fix this by using the target
variable to grab our target schema:
# dbt_project.yml
...
on-run-end:
- 'grant usage on schema {{ target.schema }} to group reporter'
- 'grant select on all tables in schema {{ target.schema }} to group reporter'
Better! Again, kind of…
The problem with using on-run-end
hooks here is that when dbt runs, behind the scenes it drops and recreates the table or view – this means that your model loses any exsiting privileges you’ve granted on it. As a result, for the period of time between when a model runs, and the end of the run, no one will be able to query that model, instead they’ll get a “permission denied” error. This creates downtime in your BI tool.
Using post-hooks to grant straight away
One way around this is to grant privileges on every model as soon as it is created, using a post-hook
, combined with the this
variable. {{ This }} looks like:
-- dbt_project.yml
...
models:
post-hook: 'grant select on {{ this }} to group reporter'
seeds:
post-hook: 'grant select on {{ this }} to group reporter'
on-run-end:
- 'grant usage on schema {{ target.schema }} to group reporter'
It’s worth nothing here that the first time dbt runs with a new target.schema
, your BI tools won’t have access to the relations until the end of the run. This is because the reporter
group doesn’t yet have usage
privileges on the schema. On any subsequent runs your BI tool will get privileges straight away, and even though the usage
statement is being run, it’s just reapplying a privilege that already exists. I feel very OK with this – it’s pretty rare that I want to query something from my BI tool the very first time I’m running dbt with a new target.
Leveraging default grants
Personally, I prefer to leverage default
grants (or future
grants in Snowflake – more on that below), this means that as soon as a relation is created in a schema, the warehouse applies the right privileges to it. If I were writing this DDL from scratch, I would run these statements right after I create a schema:
create schema analytics;
grant usage on schema analytics to group reporter;
alter default privileges in schema analytics
grant select on tables to group reporter;
But I very rarely run DDL from scratch (dbt does it for me!). So instead in dbt I use an on-run-end
hook, like so:
# dbt_project.yml
...
on-run-end:
- 'grant usage on schema {{ target.schema }} to group reporter'
- 'grant select on all tables in schema {{ target.schema }} to group reporter'
- 'alter default privileges in schema {{ target.schema }} grant select on tables to group reporter'
Since we ran this at the end of the run, there will be table and views already in your schema by the time you apply the default privileges. As such, we still have to run the grant select on all tables
statement, because default privileges only get applied to new objects being created going forward.
In reality, these statements only need to be run once for each schema (ever), but running them lots of times won’t hurt! And this way they are version controlled, and dbt will take care of running them whenever a new dbt user wants to use a different target schema.
Catering for multiple schemas
In dbt, you can use custom schemas to split the models and views dbt builds across multiple schemas. So far, we’ve only been granting privileges on the target schema, so your BI tool won’t be able to query any of the other things you’ve created. You might choose to leave things like this by design. We prefer to instead grant privileges to every schema dbt creates, by leveraging the schemas
variable.
Now our on-run-end
hooks look like:
on-run-end:
- '{% for schema in schemas %}grant usage on schema {{ schema }} to group reporter;{% endfor %}'
- '{% for schema in schemas %}grant select on all tables in schema {{ schema }} to group reporter;{% endfor %}'
- '{% for schema in schemas %}alter default privileges in schema {{ schema }} grant select on tables to group reporter{% endfor %}'
^ This is perfectly good code, but it’s not particularly readable! Especially since those for
loops are repeated. So let’s abstract it into a macro.
Packaging it up in a macro
We like to create a macro, grant_select_on_schemas
that packages up this logic for us!
-- macros/grant_select_on_schemas.sql
{% macro grant_select_on_schemas(schemas, group) %}
{% for schema in schemas %}
grant usage on schema {{ schema }} to group {{ group }};
grant select on all tables in schema {{ schema }} to group {{ group }};
alter default privileges in schema {{ schema }}
grant select on tables to group {{ group }};
{% endfor %}
{% endmacro %}
Then in we can call this macro in our post-hook, passing it our schema
variable and the name of our group:
# dbt_project.yml
on-run-end:
- "{{ grant_select_on_schemas(schemas, 'reporter') }}"
Implementing this on Snowflake
The exact same concepts apply on Snowflake, just with slightly different syntax. Notably,:
- you grant privileges to
roles
in Snowflake (if you’re interested, I’ve written more about the differences in roles across warehouses here) - you have to grant privileges to
tables
andviews
separately default
grants are namedfuture
grants, with a slightly different syntax, and requiring special privileges (see below)
We use the same macro name, calling it the same way in our dbt_project.yml
file, but the contents of the macro are different:
-- macros/grant_select_on_schemas.sql
{% macro grant_select_on_schemas(schemas, role) %}
{% for schema in schemas %}
grant usage on schema {{ schema }} to role {{ role }};
grant select on all tables in schema {{ schema }} to role {{ role }};
grant select on all views in schema {{ schema }} to role {{ role }};
grant select on future tables in schema {{ schema }} to role {{ role }};
grant select on future views in schema {{ schema }} to role {{ role }};
{% endfor %}
{% endmacro %}
To be able to grant future
privileges on Snowflake, your role needs to have the manage grants
privilege (see here) – you can apply this from the Snowflake UI as the SECURITYADMIN.
use role securityadmin;
grant manage grants on account to role transformer;
It is worth noting that this privilege is a global privilege – now anyone using the transformer
role can change grants on any object as though they are the owner of the object. Up to you if you’re comfortable with this! If not, you may want to use a combination of post-hooks
and on-run-end
hooks instead