The problem I’m having is that I’m trying to use the same model logic for multiple different environments. This would work great if each of the environments had the same underlying raw data, but they don’t.
For context, we have a multitude of clients data in our main production database (in Snowflake). We are using Fivetran to extract their data from the sources. Right now, we have one connector for each connector type pulling in all client accounts at once. For various reasons, we would like to split all of our clients data so they have their own database, which in turn means that each client has their own connector in Fivetran. Some clients take advantage of different data sources, some many, some few. While one client might have GA4, Meta, Reddit, Pinterest, LinkedIn, and others, another client might only have GA4 and Meta.
The way our dbt instance is currently set up, we are doing all the transformations with all of the different data types. We consolidate ads data into 1 Master Ads table. When trying with an individual client who has a subset of data types, we run into errors because the source tables don’t exist. I don’t however, want to create different dbt accounts for each client, as it would be easier to make “General” updates this way.
The goal here would be that each database (in this case environment) uses the same logic, but transforms different data. Where one database might not have raw Pinterest data, it would just skip that model all together instead of reporting that it doesn’t exist/isn’t authorized.
Not sure what the best course of action is here. If I need to elaborate more I’d be happy to!