Hi everyone,
I’m very new to the tool, so I’m still getting my head around some concepts. Apologies if I’ve made a rookie mistake.
We’ll have Segment data coming from 3 platforms into Snowflake. The structure will look something like this:
segment_raw
- ios
- android
- web
My idea for modelling the databases is as follows: raw → staging → reporting. Within the staging and reporting databases, there would be a schema for each platform (since there are multiple complex analyses per platform), plus an “overall” schema for cross-platform work. Only reporting should be visible to end users. That would give:
segment_raw
- ios
- android
- web
staging
- ios
- android
- web
- overall
reporting
- ios
- android
- web
- overall
In my head, staging and reporting would be separate folders inside the “models” folder of the dbt project. Each schema would be a subfolder, and any particular analysis would be a further subfolder, e.g. “models/staging/overall/retention/monthly_retention.sql”
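To make the idea concrete, here is a rough sketch of how I imagine the folder-to-database/schema mapping might be configured in dbt_project.yml. This is only my assumption of how it would work: “my_project” is a placeholder project name, and I understand that by default dbt prefixes a custom schema with the target schema (e.g. “<target_schema>_ios”) unless the generate_schema_name macro is overridden.

```yaml
# dbt_project.yml (sketch only; "my_project" is a placeholder name)
models:
  my_project:
    staging:              # models/staging/...
      +database: staging
      ios:
        +schema: ios
      android:
        +schema: android
      web:
        +schema: web
      overall:            # models/staging/overall/...
        +schema: overall
    reporting:            # models/reporting/...
      +database: reporting
      ios:
        +schema: ios
      android:
        +schema: android
      web:
        +schema: web
      overall:
        +schema: overall
```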
Questions:
1- How do I use the “ref” function to point to a particular model in another folder/schema?
2- Is this a good way to proceed with dbt?
3- What would the development structure look like? A cloned copy of the staging database for each analyst?
4- Where should the different source.yml, schema.yml, etc. files live? I still don’t quite understand the logic behind some of these configuration files.