Managing Authorization in Google BigQuery

eevans · October 14, 2020, 3:40pm

TLDR: programmatically manage dataset authorization and authorized view creation with a script that parses the dbt yamls and the manifest.

the Why: Here at Butterfly Network, we use Google BigQuery as our Datawarehouse. We structure out warehouse following the approach of raw, prc, and mart datasets managing all of our transformations with DBT. For the most part we grant end-users access to our mart datasets. Some of these are views on top of raw data that could include pii or other information that the end-user is not privy to. For BigQuery the obvious solution was authorized views.

We decided that a PR with DBT code was valid to say that the code is valid and the views that are created are authorized. We created a script that would parse the manifest and authorize the views that were created by DBT, such that the developer had to do no further work or remember to explicitly grant anything on a post hook.

Since we were dabbling in a script to authorize datasets, we also decided that we wanted to include the ability to specify who had access to which dataset in the schema.yml files. This enables our analytics team to have a one-stop shop to do all of their work and we can track when access was granted and why.

Here’s how we have our analyst specify access:

models: 
 -name: <XXX>
...
datasets:
 - name: mart_<XXX>
   access:
     - <group1>@butterflynetinc.com
     - <group2>@butterflynetinc.com
     - <group3>@butterflynetinc.com

and here’s the script we use to permission https://gist.github.com/emilyevans-bn/9055295f5602806c6fab013ac86711b7

A few call outs: we only permission groups to access the datasets in this way and we do not handle ephemeral models in our parsing because we don’t really use them

Hope this is helpful to someone!

eevans · May 14, 2021, 5:05pm

I cleaned up old gists a while back and lost the above, so reposted here dbt - bq dataset authorizer · GitHub

Topic		Replies	Views
Feature Request: BigQuery Authorized Dataset In-Depth Discussions	0	83	January 24, 2025
Set grants on dataset level in BigQuery Help grants	4	1859	October 4, 2024
Grant & revoke in Big query Help incremental , bigquery , grants	3	1594	January 11, 2023
Set schema/dataset for BigQuery Help bigquery , dbt-cloud	4	343	August 4, 2024
[Obsolete] The exact grant statements we use in a dbt project Archive	4	39357	July 10, 2019

Managing Authorization in Google BigQuery

Related topics