What are the best practices for using dbt and Git with multiple internal organizations?

msavage0507 · June 24, 2024, 6:37pm

When companies with many internal organizations use dbt, do they all work in the same repository so that they can keep all of the lineage data? Or is there a way to work in separate repositories but still keep this lineage?

Miroslaw · June 25, 2024, 10:51am

I would use a separate repository for each dbt project, each with its own dbt run. You can make each repository private and assign groups and roles to it individually. That way you get data governance and can assign a product owner to each repository. This is a kind of data mesh structure, I guess. You can also read about the Data Vault concept.

mmarcelo · July 1, 2024, 12:46pm

I’d like to hear more about this question. Let’s suppose we have different teams taking care of the following business areas: sales and payroll. To add to the scenario, we might want to share master_data between teams.

Now, let me share my thoughts:

create 3 environments named prod_sales, prod_payroll, and prod_master_data. Each team will be able to run and test in its own environment. Additionally, each environment can be linked to a different branch in Git;
on the database side, create 3 different schemas, with read/write permission granted only to the owning team. This will prevent someone on a different team from running or building something in someone else’s schema;
on the Git side, we could have 3 different branches, but I don’t know how to effectively segregate team access without adding complexity in Git.

More on the Git side: having different branches won’t prevent someone from creating objects related to payroll and merging them into the sales branch, but having isolated repos does. As a side effect, isolated repos make it difficult to share master_data between teams. I think that creating a rule in Git to only allow objects in a given subfolder on each branch might work. For example, only objects in the subfolder ./sales could be committed to the sales branch. Well, I’m not an expert on Git and I’m not sure whether this is possible.
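
To make the subfolder idea concrete, here is a minimal sketch in Python of the check such a rule would perform. It is not a built-in Git feature: the branch-to-folder mapping and the function name are hypothetical, and the sketch assumes the list of changed files is supplied by something like a CI job (for example, from the output of `git diff --name-only`).

```python
from pathlib import PurePosixPath

# Hypothetical mapping: each branch may only change files under one folder.
ALLOWED_FOLDER = {
    "sales": "./sales",
    "payroll": "./payroll",
    "master_data": "./master_data",
}


def disallowed_paths(branch, changed_paths):
    """Return the changed paths that fall outside the branch's folder."""
    allowed = PurePosixPath(ALLOWED_FOLDER[branch])
    return [
        path
        for path in changed_paths
        if allowed not in PurePosixPath(path).parents
    ]


# Example: a payroll file included in a change aimed at the sales branch.
changed = ["sales/orders.sql", "payroll/salaries.sql"]
print(disallowed_paths("sales", changed))
```

A check like this would reject the change whenever the returned list is not empty. Comparing path components, rather than string prefixes, keeps a folder such as sales_archive from being mistaken for sales.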

Any thoughts?

zoyaadeel · July 11, 2024, 9:57am

Hello,
I think lineage data can be kept together and easily accessible, giving a comprehensive view of data transformations and dependencies across the entire organization, even when teams work in separate repositories. This can be achieved by using dbt’s external dependencies feature (for example, installing another team’s project as a package), where each team’s repository declares dependencies on the models of other teams.
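
As a rough illustration of how lineage from separate repositories could be viewed together, the sketch below merges the dependency edges found in several dbt `manifest.json` artifacts. It relies only on the `nodes` and `depends_on.nodes` keys of the manifest. The project and model names are made up, and the sketch assumes node IDs match across projects, which is the case when one project is installed as a package in another.

```python
def merged_lineage(manifests):
    """Map each node's unique_id to a sorted list of its upstream unique_ids."""
    edges = {}
    for manifest in manifests:
        for unique_id, node in manifest.get("nodes", {}).items():
            upstream = node.get("depends_on", {}).get("nodes", [])
            edges.setdefault(unique_id, set()).update(upstream)
    return {unique_id: sorted(ups) for unique_id, ups in edges.items()}


# Hypothetical, heavily trimmed manifests from two team repositories.
master_data_manifest = {
    "nodes": {
        "model.master_data.customers": {"depends_on": {"nodes": []}},
    }
}
sales_manifest = {
    "nodes": {
        # Present here because sales installs master_data as a package.
        "model.master_data.customers": {"depends_on": {"nodes": []}},
        "model.sales.orders": {
            "depends_on": {"nodes": ["model.master_data.customers"]}
        },
    }
}

lineage = merged_lineage([master_data_manifest, sales_manifest])
print(lineage["model.sales.orders"])
```

In practice each manifest would be loaded with `json.load` from the `target/manifest.json` file produced in each repository, rather than written inline as above.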

msavage0507 · July 11, 2024, 3:10pm

Thanks, I am going to look into that.
