Optimizing grant statements via on-run-end or post-hooks (Snowflake)

I’m finally tackling my issue where the on-run-end grant statements take up most of the dbt run, instead of actually creating tables/views. For example, one view that takes 2 seconds to create, but the overall run is 22 seconds (edit: Using Snowflake):

14:32:37 | 1 of 1 START view model dbt_adam.sb_accounts......................... [RUN]
14:32:41 | 1 of 1 OK created view model dbt_adam.sb_accounts.................... [SUCCESS 1 in 2.60s]
14:32:51 |
14:32:51 | Finished running 1 view models in 22.27s.

These are my current on-run-end statements:

on-run-end:
    - 'grant usage on schema {{ target.schema }} to role reporter'
    - 'grant select on all tables in schema {{ target.schema }} to role reporter'
    - 'grant select on all views in schema {{ target.schema }} to role reporter'
    - 'grant usage on schema analytics.analytics to role bitool'
    - 'grant select on all tables in schema analytics.analytics to role bitool'
    - 'grant select on all views in schema analytics.analytics to role bitool'
    - 'grant usage on schema analytics.dbt_archive to role bitool'
    - 'grant select on all tables in schema analytics.dbt_archive to role bitool'
    - 'grant select on all views in schema analytics.dbt_archive to role bitool'

Our rules are that:

  • everything under analytics.* are available to role reporter
  • 2 schema, analytics.analytics and analytics.dbt_archive, are available to role bitool

Thoughts:

  1. I probably can comment out the last 3 statements for dbt_archive - one time grants should probably do it?
  2. I thought about moving grant select on all tables/views to a post-hook where grant select on {{this}} to role ... but how do I make sure that it obeys my rules above (bitool shouldn’t see anything in analytics.analytics)?
  3. I’ve seen other use grant select on all future tables/views on schema ... - do those future grants negate the need for post-hooks or on-run-ends by only needing to be run once? If so, that sounds swell?

Any advice is much appreciated. I’m getting tired of waiting an extra 20-30 seconds every time I’m fiddling with a single model…they sure add up. :wink:

1 Like

I would love to understand grant select on all future statements in Snowflake too! I can’t understand why I can sometimes run them and sometimes not. I didn’t see anything in the documentation on this. For example, if I have a schema owned by transformer role, I can grant select on all {tables|views} ... but I get permission denied when I try to grant future.

I think the confusing thing about grant select on future tables is that you have to run it as accountadmin, regardless of what role owns the table.

I use them and like them. However, if your automation drops and recreates schemas, it can be a little inconvenient, since you probably don’t want your automation working in the context of accountadmin.

@adamstone – sometimes I think you must have access to my Trello board, based on how well you time some of your questions!!

OK so I just did a full writeup over here which runs through the statements we run.

I was going to specifically address each of your thoughts but I think it might be easier to just instead propose an alternate solution! If the main issue is that you are tired of waiting for hooks to run when in dev, you can package your grant statements up into a macro, and have some logic that says “don’t run me unless i’m in prod”. I might have got the environment logic wrong, but you can see where I’m going with it!

# dbt_project.yml

on-run-end:
- "{{ grant_statements() }}"
-- macros/grant_statements.sql

{% macro grant_statements() %}
  {% if target.name == 'prod' %}
    grant usage on schema {{ target.schema }} to role reporter;
    grant select on all tables in schema {{ target.schema }} to role reporter;
    grant select on all views in schema {{ target.schema }} to role reporter;
    grant usage on schema analytics.analytics to role bitool;
    grant select on all tables in schema analytics.analytics to role bitool;
    grant select on all views in schema analytics.analytics to role bitool;
    grant usage on schema analytics.dbt_archive to role bitool;
    grant select on all tables in schema analytics.dbt_archive to role bitool;
    grant select on all views in schema analytics.dbt_archive to role bitool;
  {% else %}
    select 1; -- hooks will error if they don't have valid SQL in them, this handles that!
  {% endif %}
{% endmacro %}

I think there’s some further improvements you can make here, such as:

  • using the schemas variable instead of target.schema – when you upgrade to v0.14.0, the schemas variable will include the schema you build archives snapshots in, so you won’t need to hardcode it!
  • leveraging future grants to reduce downtime in your BI tool (I’m surprised this hasn’t been an issue yet!)

^ Have a read of the article I linked to for more information on how we leverage schemas and future grants.

To answer some of your more specific questions:

But to specifically address your thoughts:

I probably can comment out the last 3 statements for dbt_archive - one time grants should probably do it?

If you use the schemas variable and upgrade to 0.14.0, you don’t have to write specific logic for your dbt_archive schema!

Also I like the logic of “the thing that creates your schema should grant the privileges”, so including these grant statements feels like a good idea, even though you end up running the same statements again and again.

I thought about moving grant select on all tables/views to a post-hook like:
grant select on {{this}} to role ...
but how do I make sure that it obeys my rules above (bitool shouldn’t see anything in analytics.analytics bitool can only select from analytics.analytics.* )?

I’ve just made a slight edit to your question based on my understanding of your setup (lmk if I got that right)!

I personally prefer using on-run-end hooks; they are tidier than having post-hooks. As an aside, it is totally possible to implement your “rules” into post-hooks – you could nest your models in a subdirectory and only apply the post-hook to those models; or you can call a macro from your post-hook and have the logic about whether to grant privileges in the macro.

But, given that the main things you’re trying to solve here is (1) reducing run time in dev, and (2) tidying things up, I don’t think you should make this change.

I’ve seen other use grant select on all future tables/views on schema ... - do those future grants negate the need for post-hooks or on-run-ends by only needing to be run once? If so, that sounds swell?

Technically, those statements do only need to be run once (see my other article for more info), but because we implement them as on-run-end hooks they get run on every run! You could just run them manually outside of Snowflake and then not have them in your project, but again, I like having them in your project so it’s clear to others what statements need to be run!

I really think if your main issue is run time in dev, packaging your statements in a macro, and putting in some logic about only running your grants in prod will be the 80% fix!

And @mplovepop/@john, to run future grants, you need to have the manage grant privilege, check out the “Implementing this on Snowflake” section here.

I :100: agree about the usefulness of including those statements that “technically-only-need-to-be-run-once” in the on-run-ends or post-hooks so everyone can see what needs to be run to make everything work.

Incredible response, @claire, that answers all my question, and I think it also indirectly taught me that maybe I actually don’t need to grant privileges in dev at all (e.g. when I’m trying out new models) because, as the user that made the table/view, I don’t have to grant myself privileges on it. (a duh moment!) I think that’s where I was stuck on for a while, thinking I also had to do grant statements in dev (as long as nobody else needs to look at my dev schema).

THANK YOU!