Testing dbt changes in downstream BI tools

dylanbaker · January 10, 2019, 3:57pm

Does anyone do any testing in downstream BI tools when pushing changes to dbt? My use case would be for Looker specifically. Occasionally someone may think it’s safe to delete or rename a column, and a dimension or measure may stop working in Looker.

An idea would be to have something that tries to query important explores via the API, selecting all columns, and maybe limiting to 1 row for performance. In Looker’s case, I think you would also have to tell Looker to look at a staging schema where dbt has been run to reflect the changes, and also possibly tell it to run on a development branch in case you’ve made corresponding changes in Looker you want to test simultaneously?

I’d be interested in things that people might be doing in this area, and thoughts generally around how people think something could be implemented.

tmurphy · January 10, 2019, 4:23pm

We’re using user attributes to point to a user’s scratch schema while in dev mode. (See https://gitlab.com/gitlab-data/looker/blob/master/pipe2spend_consolidated.view.lkml#L3-6 ) The proposed dbt will get run and point to the scratch schema. We’ll toggle back and forth between dev and not to see what effects it has on different explores. Sometimes we’ll be doubly logged in with one and dev and one not in dev to see the changes side by side (lots of screenshots).

We’ve also used gazer (https://discourse.looker.com/t/gazer-a-command-line-tool-for-looker-content-management/8066) to output all the looks and dashboards to see if we’re missing anything. In our migration to Snowflake we had all of the looks and dashboards in Airtable with screenshots so we could validate changes.

All that to see, it’s not easy and none of this is ideal in my opinion.

michael.dunn · January 10, 2019, 7:00pm

We have two separate DW instances (in BigQuery, these are “projects”) and two Sinter projects. The develop branch of our repo is consumed by one Sinter job, and loads the dev DW. The master branch is consumed by the other Sinter job, and load the prod DW.

We’ve used Looker’s -- if dev and -- if prod feature along with user attributes to have Looker default to using the dev DW when in dev mode, and the prod DW when not in dev mode.

We’re also set up so that when the DBT target is anything other than prod, instead of models being built in stg, intermediate, analytics, and metrics schemas, they’re loaded into a single dev_<username> schema. This is also configured congruently in user attributes in Looker.

By changing the user attributes, we can make production looks point to dev data, we can point dev explores to prod data, or use the default prod to prod, dev to dev alignment.

With this approach, Looker’s content validator is our friend.

Topic		Replies	Views
Strategy for dbt modeling when using Looker Archive	2	5193	June 3, 2021
Looker user attributes and `if dev` to auto-switch between dev & prod schemas Show and Tell environments , looker	11	9774	November 8, 2020
Faster data modeling with Looker and dbt (Webinar Q&A with Drizly) Archive	0	3421	October 5, 2020
Best practices to enforce models without failing assertions Archive	1	2495	May 20, 2022
Best Practices on Deploying BI/ Reporting Tools In-Depth Discussions best-practice , bi , tools	3	5440	May 7, 2019

Testing dbt changes in downstream BI tools

Related topics