How do you structure your marts & database schemas?

Hey all,

So I’m pretty new to all of this, over the past 2 years I have been transitioning my company into a modern data stack from an environment of a single erp database and canned crystal reports. For the better half of that two years, we’ve brought in our ERP data (two different systems) to our data warehouse (Snowflake) and have been developing views in the raw database to support analytics. This was working for the most part but we quickly scaled out of it and now I’m pushing for dbt. I’ve based our implementation on the best practices documentation found here and in the official dbt docs, but there are a few things I am struggling to understand.

Assuming the structure detailed in the best practices, marts are structured based on department the data belongs to - sales, marketing, operations, etc. With this in mind, I would assume that these marts are not 1:1 with the schema structure of the analytics database, is that correct? The structure I had in mind and wanted to do was have one schema per model, wouldn’t it be wise to mirror that schema structure in the marts? So sales for example would become several marts - sales_orders, sales_quotes, sales_opportunities, etc and they each go into their own schema. How are other people handling this? Should I not look to create one model per schema? I started doing multiple models in a schema, but the problem was it did not become logical just by looking at the db that certain dimensions go with certain facts (until you queried the tables and looked at the keys) and I feel like the method of one model per schema would be self documenting and would help analysts that do not spend all day in the database or might not be as experienced using SQL.

Sorry if this is sort of a newbie question, I was kind of thrown in on the deep end and trying to learn as I go.



Kindly check out this great Discourse made by @claire which elucidate some best practices on dbt project structuring.

I personally used it to help build a modern Data stack for my organization.

FWIW, here’s our current dbt project structure.

We split our models into datamarts (for downstream users), domains (which contain dimensionally modeled models split by subject domains), and legacy (which is not dimensionally modeled).

Then we have subfolders common (for CTEs), business (for dimensional models), and denormalized (for denormalized views).

This doesn’t follow the official dbt recommendations at all, but it works for us. We’ll probably add a staging layer down the line too (models/staging).