@arvindr – I would love to see that script! You should totally consider open sourcing it
@JoeNaso – re: your original question:
First of all, I have been in this situation more times than I wish to admit. The first thing that I recommend is creating a culture that helps you debug “broken” or “wrong” dashboards quickly, to help you understand what exactly has changed! We have a guide on reporting bugs as well as an issue template that we share with any BI users that are reporting issues.
^That’s obviously a reactive strategy, and long-term it’s better to be proactive, so my $0.02AUD on this:
I think there’s two types of changes that can impact a dashboard:
- The data model that the dashboard selects from has changed
- The dashboard structure has changed
These are pretty distinct problems, so I’ll address them separately.
1. The data model that the dashboard selects from has changed
From a process perspective, here are some suggestions:
- Implement a PR template that prompts dbt users to indicate if any existing models are changing – you can see our standard template here. This small change can have big impacts.
- Check the “before and after” of any major dashboards – no matter which BI tool we are using, we find a way to parameterize the schema we select from, so that we can easily do this.
- Write some audit macros to validate changes! I have some macros that compare columns across two relations, and often use a version of the equality macro to do record-by-record validation. Pro-tip: write the audit query before you make your changes; you’ll thank yourself at the end . Also check out dbt-helper’s
compare command – I’m pretty sure you can use this to detect model name changes, and it can probably be extended to also compare column changes (I’m sure @michael-kaminsky would accept a PR!).
- If your data model changes affect an end user, they should be involved in the PR process as a reviewer!
- If your BI tool has a git integration, every dbt PR should have a complementary PR on the repo containing your BI code (or you could even have them both in the same repo! I haven’t done this, but it’s not a terrible idea!)
2. The dashboard structure has changed
i.e. the fields the dashboard selects, the visualizations it uses, or the filters it applies.
The answer to this one ultimately depends on the BI tool you’re using, and I think the above responses do a really good job at covering this, so I won’t repeat what they’ve said.
One point I’d like to make here is that sometimes dashboards change/mistakes creep in because the data model an analyst is querying isn’t intuitive/well documented. Strong data model design, and good documentation can go a long way on this front, which of course, dbt helps a lot with.
A few extra points if you do use Looker:
- Looker offers LookML dashboards that are written in code, so can also be version controlled. I’ve seen some orgs have “Source of Truth” dashboards in LookML, and all non-Source of Truth dashboards as user-defined dashboards
- Alternatively, you can use Looker’s permissions to ensure that only authorized users are allowed to edit a user-defined dashboard.
- I find that Looker can abstract away the git workflow too much – as an analyst whose first introduction to git was via Looker, I definitely learned bad git habits. I think it’s worth spending the time to go through basic git workflows with your analysts (especially if you want to include them in the PR process, as outlined above), and (as you might expect) we’ve open sourced our git guide here.