How Drizly is improving collaboration with external partners using dbt & Snowflake shares
Drizly is the leader in alcohol e-commerce. We partner with local retailers so you can shop the large selection of beer, wine and liquor across stores, right from your phone, and get it delivered in under 60 minutes. Along with providing this service, we strategically partner with alcohol brands and suppliers. These include companies like ABI, who own brands such as Bud Light, Budweiser, and Michelob, or Diageo, who owns many liquor brands such as Casamigos, Bulleit, and Smirnoff. These partners are interested in Drizly’s unique data set for many reasons. It can be used to segment marketing campaigns, fix inventory gaps, and just create a better overall customer experience.
We used to share this data via adhoc SFTP jobs. This wasn’t scalable (each new partner required a lot of work), and was tedious to get right, often resulting in chain emails with 50+ messages to get the details right.
Since migrating to Snowflake, we started leveraging their Data Sharing features to share data with our suppliers securely, and in a scalable way.
I recently spoke about this on an Office Hours (slides):
This has removed an enormous headache for us while creating a ton of value for our partners. Our approach will be useful for just about any business that works with external partners:
- SaaS companies who want to provide data to their customers (e.g. an HR software company that wants to enable customers to query their own data)
- Marketplaces / platforms that want to share data with their suppliers, who often can’t get this data themselves
- Organizations that have strategic partnerships and want to share data securely
Creating a share manually
The below SQL gives you an idea of how to create a share manually.
create or replace share sales_s;
alter share sales_s add accounts=xy12345;
grant usage on database analytics to share sales_s;
create schema if not exists sharing;
create secure view analytics.sharing.sales as (
select ...
);
grant usage on schema analytics.sharing to share sales_s;
grant select on view analytics.sharing.sales to share sales_s;
There’s a few things to note about shares:
- Only the
accountadmin
role can run acreate share
command - When sharing a view, it must be a
secure
view
And for our use case, we had a few extra considerations:
- We only want to share prod data, not dev ones
- Suppliers must only be able to read their own data
But manually creating shares was not enough. Instead, we chose to use dbt for this process because:
- By version controlling our code, we implement extra security around what data gets shared and with which account
- By using dbt hooks to run this code, this process becomes scalable — the effort to add one (or a dozen!) new suppliers is minimal.
So, on to dbt!
Using dbt to automate this process
Step 1: Create reader accounts
Create a reader account for a supplier so they can login to Snowflake. We use terraform to automate and version control this process.
Step 2: Map internal supplier IDs to Snowflake accounts IDs
Use a seed for this. This means the list of accounts are version controlled, reducing the risk of any mistakes happening.
seeds/supplier_share_access_rules.csv
company_id,account_id
4,FKA85312
5,UVZ23642
6,RPX31467
Step 3: Create the share
We use a pre-hook to do this. To make things tidier (and so that we could re-use it across multiple models), we bundled the code into a macro.
models:
sharing:
pre-hook:
- "{{ create_share('drizlys_share', ['FKA85312', 'UVZ23642', 'RPX31467'] }}"
Our macro also includes logic to make sure this only runs in prod (leveraging the target variable). And the SQL uses the if not exists
syntax to avoid errors.
-- macros/create_share.sql
{% macro create_share(share_name, accounts) %}
{% if target.name == "prod" %}
{% set sql %}
CREATE SHARE IF NOT EXISTS {{ share_name }};
GRANT USAGE ON DATABASE {{ target.database }} TO SHARE {{ share_name }};
{% for account in accounts %}
ALTER SHARE {{ share_name }} ADD ACCOUNTS = {{ account }};
{% endfor %}
{% endset %}
{% set table = run_query(sql) %}
{% endif %}
{% endmacro %}
Step 4: Create the view to be shared
We only want suppliers to be able to see their own data. Rather than create a model for each supplier that we share separately, we use Snowflake’s current_account() function to create a filter, joining back to our seed file to make sure we are filtering correctly.
SELECT *
FROM {{ ref('store_order_items_supplier_vw') }}
WHERE company_id IN (SELECT company_id
FROM {{ ref('supplier_share_access_rules') }}
WHERE UPPER(account_id) = CURRENT_ACCOUNT())
Step 5: Share the view
Use a post-hook to grant the share. Again, we used a macro here.
{{ config(
materialized='view'
secure=true,
post_hook="{{ share_view ('SHARING', 'SHARED_STORE_ORDER_ITEMS_VW', 'drizly_share' }}"
) }}
Again, this only runs in prod!
{% macro share_view(view_schema, view_name, share_name) %}
{% if target.name == "prod" %}
{% set sql %}
GRANT USAGE ON SCHEMA {{ target.database }}.{{ view_schema }} TO SHARE {{ share_name }};
GRANT SELECT ON TABLE {{ target.database }}.{{ view_schema}}.{{ view_name }} TO SHARE {{ share_name }}
{% endset %}
{% set table = run_query(sql) %}
{% endif %}
{% endmacro %}
Step 6: Relax
And that’s it! Next time you dbt run
in prod, the shares will be successfully created!
Once this is setup, adding a new supplier only takes a few minutes — we just add them to our dbt project and terraform config, and everything gets created as expected.
Future improvements
We’re pretty happy with this process so far. From here, the only thing we are going to tweak is setting up resource monitors within our Terraform config to help control cost, since reader account credits are charged to our account. If you’d like to follow the issue thats open with terraform regarding resource monitors, you can find that here.