Stop schemas from concatenating

ab2374923 · August 28, 2024, 9:43am

I am using dbt core 1.5.4 for BigQuery.
Apparently it is mandatory to define a (default) schema in the profiles.yml file - in this example I called it “main” > see below.
I have several folders within the models folder and I want to freely define schema (BigQuery destination datasets) for each subfolder regardless of what is defined as default schema in profiles.yml
From all that I read I need to define a schema for each subfolder in the dbt_project.yml file which I attempted below opting for “models_1” as schema name that should be applied to all models in “models_subfolder_1”
I expect dbt to write all models from this subfolder into a dataset named “models_1” instead it creates a dataset named “main_models_1” meaning it concatenates the default schema in profiles.yml with the schema explicitly defined for each subfolder. How can I stop dbt from doing this?

profiles.yml

my_dbt_project: 
  target: "{{ env_var('DBT_ENVIRONMENT', 'dev_local') }}"
  outputs:
    dev_local:
      schema: main
      location: EU
      type: bigquery
      method: oauth
      project: my_bigquery_project
      threads: 4
      timeout_seconds: 300

dbt_project.yml

name: 'main'
version: '1.0.0'
profile: 'main'

models:
  main:
    models_subfolder_1:
      +materialized: table
      +schema: models_1

ab2374923 · August 28, 2024, 10:41am

ok, I found the solution with the help of ChatGPT. Apparently dbt has an internal macro called generate_schema_name that is responsible for how the schema name is generated. It exists under the hood meaning it is not visible in the macros folder. You can overwrite (= create a macro with the same name in the macros folder of you project) as shown below. This will make sure that the schema from profiles.yml is ignored and only apply whatever is configured in dbt_project.yml. So this works but does it really need to be so difficult? Still curious to hear if others had the same problem and how they solved it …

{% macro generate_schema_name(custom_schema_name, node) %}
    {{ custom_schema_name }}
{% endmacro %}

Fernando · September 8, 2024, 10:32pm

Found the same solution here: https://www.youtube.com/watch?v=AvrVQr5FHwk

I went through the same situation with dbt CLoud.

dbt has created my Dataset in BigQuery and it was not working because of some region configuration.
I tried to use another dataset but the schema in dbt_profile.yml was being concatenated with my profile schema.

I tried your solution and it worked, but once the macro is executed once it overwrite the IDE original macro and setting the schema in the dbt_profile.yml becomes mandatory

I understood that in Development mode dbt Cloud force to use the profile dataset which makes sense.

ab2374923 · September 13, 2024, 8:54am

Damn - didn’t find this in the documentation! Anyway, good to know that that’s the way it’s done - thx for posting!

Topic		Replies	Views
Writing to a default schema Help bigquery , dbt-core	3	735	May 29, 2024
From where does DBT (BigQuery) take project and dataset while running the model. Help variables	6	4529	July 10, 2023
schema name is not correct Help	4	1370	March 2, 2023
Custom Versions of generate_schema_name Show and Tell jinja , environments	0	4884	September 8, 2020
How to overwrite schema set in `profile.yml`? Archive	4	8582	December 1, 2021

Stop schemas from concatenating

Related topics