How to structure different databases and schemas with more than one platform?

Hi everyone,

I’m very new to the tool, so I am still making my head around some concepts. Apologies if there’s a rookie mistake.

We’ll have segment data coming from 3 platforms into Snowflake. The structure will look something like this:
segment_raw

  • ios
  • android
  • web

My idea of modelling databases is as follows: raw → staging → reporting. Within those databases, there would be a schema for each platform (given there are multiple complex analysis for each platform). But also, there would be a schema for “overall” in staging and reporting. Reporting should only be seen by end-users. Therefore having:

segment_raw

  • ios
  • android
  • web

staging

  • ios
  • android
  • web
  • overall

reporting

  • ios
  • android
  • web
  • overall

In my head, staging and reporting would be different folders inside “models” folder within dbt. Each schema would be a subfolder and any particular analysis would be another subfolder: i.e. “models/staging/overall/retention/monthly_retention.sql”

Questions:
1- How to use the “ref” function pointing to a particular model in another folder/schema?
2- Is this a good way to proceed with dbt?
3- How would the development structure look like? a clone copy of the staging database for each analyst?
4- What should the location of the different source.yml , schema.yml, etc be? I still don’t quite understand the logic behind some of these configuration files.

Hi Joeen,
I’m going to take a stab at this. I think its less about your DBT project folders than your intended layout in the database. DBT will let you organise a project into complex folder structures yet have them all end up in a single schema. The thing to remember with a single schema is that all tables need a unique name. Therefore DBT insists that each model.sql file is unique.

Name your model files in a sensible but unique way and organise them in folders for clarity only. The folders are there for legibility but do not actually do much.
Use schema over rides to put these tables in different schema in the database if required.

Just curious what your final folder structure was like?

I’m also in the process of setting this up too for segment across 3 devices & 1 backend events too, using staging table to create first layer, but still planning the rest based on multiple github open source models on segment