- I am using dbt Core 1.5.4 for BigQuery.
- Apparently it is mandatory to define a (default) schema in the profiles.yml file; in this example I called it "main" (see below).
- I have several subfolders within the models folder, and I want to freely define the schema (BigQuery destination dataset) for each subfolder, regardless of the default schema defined in profiles.yml.
- From everything I have read, I need to define a schema for each subfolder in the dbt_project.yml file, which I attempted below, opting for "models_1" as the schema name that should apply to all models in "models_subfolder_1".
- I expected dbt to write all models from this subfolder into a dataset named "models_1"; instead it creates a dataset named "main_models_1", i.e. it concatenates the default schema from profiles.yml with the schema explicitly defined for the subfolder. How can I stop dbt from doing this?
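To make the observed behavior concrete, here is a plain-Python sketch of the naming logic (a hypothetical illustration, not dbt's actual source code): when a custom schema is configured, dbt by default prefixes it with the target schema from the profile.

```python
def generate_schema_name(custom_schema_name, target_schema="main"):
    """Hypothetical Python sketch of dbt's default schema naming:
    the target schema from profiles.yml is prepended to any custom schema."""
    if custom_schema_name is None:
        # no custom schema configured: use the profile's default schema
        return target_schema
    # custom schema configured: concatenate with the default schema
    return f"{target_schema}_{custom_schema_name.strip()}"

# With schema "main" in profiles.yml and +schema: models_1 in dbt_project.yml:
print(generate_schema_name("models_1"))  # main_models_1
print(generate_schema_name(None))        # main
```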
profiles.yml:

```yaml
my_dbt_project:
  target: "{{ env_var('DBT_ENVIRONMENT', 'dev_local') }}"
  outputs:
    dev_local:
      schema: main
      location: EU
      type: bigquery
      method: oauth
      project: my_bigquery_project
      threads: 4
      timeout_seconds: 300
```
dbt_project.yml:

```yaml
name: 'main'
version: '1.0.0'
profile: 'my_dbt_project'  # must match the top-level key in profiles.yml

models:
  main:
    models_subfolder_1:
      +materialized: table
      +schema: models_1
```
OK, I found the solution with the help of ChatGPT. dbt has an internal macro called generate_schema_name that is responsible for how the schema name is generated. It exists under the hood, meaning it is not visible in your project's macros folder. You can override it (i.e. create a macro with the same name in the macros folder of your project) as shown below. This ensures the default schema from profiles.yml is ignored and only whatever is configured in dbt_project.yml is applied. So this works, but does it really need to be so difficult? Still curious to hear whether others had the same problem and how they solved it …
```jinja
{% macro generate_schema_name(custom_schema_name, node) -%}
    {#- Use the custom schema as-is; fall back to the
        profile schema when no custom schema is configured -#}
    {{ custom_schema_name | trim if custom_schema_name is not none else target.schema }}
{%- endmacro %}
```
Found the same solution here: https://www.youtube.com/watch?v=AvrVQr5FHwk
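For context, the built-in macro that this override shadows looks roughly like the following (paraphrased from dbt's global project macros; the exact source may differ by version). The else branch is where the `<target schema>_<custom schema>` concatenation comes from:

```jinja
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {#- default behavior: prefix the custom schema with the target schema -#}
        {{ default_schema }}_{{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
```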
I went through the same situation with dbt Cloud.
dbt had created my dataset in BigQuery, and it was not working because of a region configuration issue.
I tried to use another dataset, but the schema in dbt_project.yml was being concatenated with my profile schema.
I tried your solution and it worked, but once the macro is executed it overrides the built-in macro, and setting the schema in dbt_project.yml becomes mandatory.
I understood that in development mode dbt Cloud forces the use of the profile dataset, which makes sense.
Damn - didn’t find this in the documentation! Anyway, good to know that that’s the way it’s done - thx for posting!