I recently asked a question in Slack about best practices around modeling, and I'd like to continue that discussion here because there are some things I'm still unclear about.
To take a real-world example: I'm using Mode as a BI tool, and I want to create a stacked bar chart showing how many recent upgrades there were by day, for a certain segment of users, broken down by plan.
Here’s what the query looks like if I write it directly in Mode:
select
    upgrade_date,
    plans.name as plan_name,
    count(*) as upgrades
from analytics.users
join analytics.plans using (plan_id)
where
    upgrade_date >= current_date - interval '30 days' and
    vertical = 'work'
group by 1, 2
Alternatively, I can model this using dbt, something like this:
with recent_work_users as (

    select *
    from {{ ref('users') }}
    where
        vertical = 'work' and
        upgrade_date >= current_date - interval '30 days'

),

final as (

    select
        upgrade_date,
        plans.name as plan_name,
        count(*) as upgrades
    from recent_work_users
    join {{ ref('plans') }} as plans using (plan_id)
    group by 1, 2

)

select * from final
And then in Mode simply run:
select * from analytics.recent_work_upgrades_by_plan
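For completeness, the dbt model above would also need to be materialized so that `analytics.recent_work_upgrades_by_plan` actually exists in the warehouse. A minimal sketch of a config header, assuming the model file is named `recent_work_upgrades_by_plan.sql` (the filename and materialization choice are mine, not settled):

```sql
-- assumed filename: models/recent_work_upgrades_by_plan.sql
-- materializing as a view keeps it in sync with the underlying tables;
-- a table would avoid recomputing the aggregation on every Mode run
{{ config(materialized='view') }}
```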
Which approach makes more sense: writing the query directly in Mode, or modeling it in dbt? I could be wrong, but I feel like this isn't what I should be modeling in dbt; on the other hand, writing the full query in Mode doesn't seem ideal either, because it won't be version controlled, etc.
It may be that analyses are the way to go, but I'm unclear on how to fit them into my workflow. For example, I'll typically draft a query, run it, then repeat a few times until things are just right. Would I do that in Mode, then paste the final query into a new analysis file, replace the view names with dbt references, convert it to use CTEs, compile it, and paste the compiled query back into Mode? I'm guessing there's a better way :).
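For reference, the compile step in that workflow happens on the command line: `dbt compile` resolves `{{ ref(...) }}` calls into fully qualified relation names and writes the result under `target/compiled/`. A minimal sketch, where the project name and analysis filename are hypothetical placeholders:

```shell
# compile the project; refs are resolved to schema-qualified names
dbt compile

# the compiled SQL (ready to paste into Mode) mirrors the project layout
# (project and file names below are assumptions for illustration)
cat target/compiled/my_project/analyses/recent_work_upgrades.sql
```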
It seems like there are a lot of ways this can be done - which approach would you all recommend?