Programmatic Generation of Many Many Models

I have this use case where I need to create over 100+ models with the exact same query and this needs to be done across many data sources. Here’s the options I have come up with so far but none of them seem to answer my needs 100%

  1. Create the models manually - this is just too painful
  2. Use something like dbt code-gen and dbt-generator - The main problem with this option is that it is not 100% automated and it requires developer intervention to run the generators and then make the outputs part of their work/commit
  3. Create the models automatically (100%) via custom macros that are run from our CI/CD - this option works like a charm; however, it breaks the dbt constructs as the database entities created (Snowflake views) are not added as resources/nodes into dbt and they are not reported as models (e.g. dbt docs won’t show them as model as dbt won’t know about these new views)
  4. Create the models automatically (100%) via a custom materialization - this option has the same issues as #3. It seems that by design dbt can only have one database entity (e.g. Snowflake view) per model, which means that only the model where the materialization is used is registered with dbt

We have run out of ideas here… any recommendations or is this a dead end for dbt?

BTW, we have already done our Google homework and we are aware of these posts:

Thanks in advance everyone

3 Likes

this needs to be done across many data sources

What are the ‘many data sources’ mentioned here? Do you have multiple databases - something like one per account/client?

If that is the case, you could use environment variable for the database in your dbt profiles.yml.

With this done, you could invoke dbt run multiple times with each database set as the environment variable. This can be done in parallel in multiple shells.