Inmon vs Kimball in dbt

ddumas · June 20, 2022, 3:38pm

Traditionally in Inmon-style data warehouse design, there is an intermediate layer between the raw “data lake” and mart (dimensional) layers. This is either accomplished with data vault (difficult), or a normalized design (easy).

Then marts are built on top of these. The idea is that is if you design your marts wrong (stovepipe designs), you have nothing to fall back on and must re-do them. Additionally, and more importantly, the intermediate layer accurately reflects you business processes.

With dbt, so far, most of what I am seeing is source (data lake) ==>stage (light transform)==>mart (fully transformed) data designs.

Why is this? I.E. Is there something about dbt that somehow eliminates the need for a middle layer? I am new to dbt, but the concept is still the same.

terry_joyce · June 23, 2022, 5:56pm

At Trove, we’ve very much built an Inmon-style data warehouse. dbt works very well for this.

To roughly characterize our model, there are core tables in the business-entity layer: a normalized representation of the business and related data entities, just as you suggested. Then there are summaries, aggregations, snapshots used in data marts. When using visualization tools like Looker, it’s more efficient to use the data marts. The core model is often the target of exploratory data analysis.

The core models are rebuilt in one dbt run each night. The marts are built in a separate run executed later. dbt would equally support building them all in a single run: the built-in dependency resolution would mean that the core models would be built first and the marts after.

Throughout the business day, the core models are updated incrementally every 10 minutes with fresh source data. Depending on how they’re used, the data marts may or may not be incremental models.

ddumas · June 23, 2022, 10:03pm

Terry,

You have communicated that perfectly. I was worried that everyone in the dbt world was skipping that business layer. Do you have any thoughts on using Data Vault as a business layer? I was at a large company where they had a Data Vault business layer. For me it seemed overly complex, but I am no data vault expert.

Thanks,

Dave

SignalRunner · July 5, 2022, 10:41pm

Hi ddumas,
I have implemented DV at a large organization, and you are right it can be insanely complex, and would potentially be over engineering for a large number of scenarios.
But I will say this - it’s a high appropriate solution that pays massive dividends if :

If your organisation has many disparate and changing source data sets, but at the same time a reasonably stable , and logical, business entity model.
You have a DW automation tool to automatically build out assets and pipelines based on the vault model (like BIMLFlex)

Other lesser reasons besides. Also building data marts from a good data vault is trivial, and more often that not can be entirely virtualized in views.

Doing it in DBT I don’t know - but probably wouldn’t try it without DW automation.

Topic		Replies	Views
How do you structure your marts & database schemas? Archive	2	4766	November 15, 2021
How to balance the need for controlled development of core models with rapid development of peripheral models In-Depth Discussions data-quality , best-practice , orchestration-and-deployment , developer-ergonomics	6	4817	June 13, 2018
Is Kimball dimensional modeling still relevant in a modern data warehouse? In-Depth Discussions kimball	17	47589	June 23, 2022
Best practice splitting SQL into staging, intermediate and marts layers and naming conventions Help	1	5589	January 3, 2024
Repeated CTEs, DRY and intermediate models good practice In-Depth Discussions project-structure , best-practice	9	5240	May 4, 2023

Inmon vs Kimball in dbt

Related topics