The problem I’m having
We want to be able to test changes to the profiles.yml “threads” value in our dbt project without having to code and deploy a change to our environments for each test. The simplest way to accomplish this seemed to be making the “threads” value a project variable.
The “old” profiles.yml had a hard-coded value:

```yaml
threads: 16
```
However, when I set a project variable in dbt_project.yml and reference that variable with Jinja in my profiles.yml, `dbt compile` still shows the old, hard-coded “threads” value.
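To illustrate what I expected to happen, here is a minimal stand-in for the substitution. This is plain Python with `re`, not dbt’s actual renderer; I assumed `{{ var('thread_count') }}` in profiles.yml would simply be replaced with the value from the `vars:` block in dbt_project.yml:

```python
import re

# from dbt_project.yml's vars: block
project_vars = {"thread_count": "24"}

def render(template: str, vars_: dict) -> str:
    """Replace {{ var('name') }} expressions with values from vars_.
    (Toy illustration only -- dbt's real Jinja rendering is more involved.)"""
    pattern = r"""\{\{\s*var\(\s*['"](\w+)['"]\s*\)\s*\}\}"""
    return re.sub(pattern, lambda m: str(vars_[m.group(1)]), template)

print(render('threads: "{{ var(\'thread_count\') }}"', project_vars))
# threads: "24"
```

This is the rendering I expected to see reflected in the “Concurrency” line of the compile output.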
The context of why I’m trying to do this
We have a tightly controlled deployment scheme and schedule, so I’d have to go through multiple layers of tests and approvals for every change to the profiles.yml “threads” value in our testing environment. For example, I want to test 24 threads, then 25, then 26, and each change would take a few days of approval processing. If the thread count is a project variable, I can run the tests from my client and then pass only one set of approvals to get the correct thread count into our production environment.
What I’ve already tried
My dbt_project.yml has the following:

```yaml
# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'qsi_dbt'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'qsi_snowflake'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
docs-paths: ["docs"] # With this option configured, dbt will only look in the specified directory for docs blocks.

clean-targets: # directories to be removed by `dbt clean`
  - "target"
  - "dbt_packages"

# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models
# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
vars:
  'cutoff_1': '09:00:00.000'
  'cutoff_2': '14:00:00.000'
  'cutoff_3': '18:00:00.000'
  'hour_1': '14'
  'hour_2': '18'
  'buffer_in_mins': '30' # Used with DATEADD(MINUTE, -'{{ buffer_in_mins }}', (select max(LAST_COMMIT_TIME) from {{ this }})) in models to avoid missing records
  'il_buffer_in_hours': '20' # Used to determine freshness of IL records in RAW, using S3 timestamp
  'thread_count': '24' # Used in profiles.yml to determine how many threads are used for dbt runs

models:
  qsi_dbt:
    # Config indicated by + and applies to all files under models/example/
    #example:
    #  +materialized: view
    on_schema_change: "sync_all_columns"
    dw_cl:
      schema: DW_CL
      +materialized: table

seeds:
  +quote_columns: false
  qsi_dbt:
    schema: DBT_WORKSPACE
```
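For completeness, here is my mental model of how the variable should resolve: a value passed on the CLI via `--vars` overrides the default in dbt_project.yml, which matches dbt’s documented variable precedence. This is a plain-Python sketch, and `resolve` is a hypothetical helper I wrote for illustration, not a dbt function:

```python
# Sketch of dbt's documented var precedence: values passed on the CLI via
# --vars override the defaults declared in dbt_project.yml.
project_vars = {"thread_count": "24"}  # from the vars: block above

def resolve(name, cli_vars, default=None):
    """Return a var value, preferring CLI --vars over dbt_project.yml."""
    if name in cli_vars:
        return cli_vars[name]
    return project_vars.get(name, default)

# e.g. dbt run --vars '{thread_count: 32}' -> the CLI value wins
print(resolve("thread_count", {"thread_count": "32"}))  # 32
# no CLI override -> the project default applies
print(resolve("thread_count", {}))  # 24
```

That override behavior is exactly what I was hoping to lean on for the thread-count tests.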
My profiles.yml has the following:

```yaml
qsi_snowflake:
  target: snowflake-db
  outputs:
    snowflake-db:
      type: snowflake
      account: "{{ env_var('DBT_SF_ACCOUNT') }}"
      user: "{{ env_var('DBT_SF_USER') }}"
      private_key_path: "{{ env_var('DBT_SF_KEY_PATH') }}"
      role: "{{ env_var('DBT_SF_ROLE') }}"
      database: "{{ env_var('DBT_SF_DATABASE') }}"
      warehouse: "{{ env_var('DBT_SF_WAREHOUSE') }}"
      schema: "{{ env_var('DBT_SF_SCHEMA') }}"
      threads: "{{ var('thread_count') }}"
      client_session_keep_alive: False
```
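One alternative I’m aware of but have not tested: this profile already renders `env_var()` successfully for account, user, and so on, so the thread count could come from an environment variable instead of a project var. In this sketch, the `DBT_THREADS` name and the default of `'16'` are placeholders I made up:

```yaml
# Untested sketch: drive threads from an environment variable with a fallback.
# env_var() accepts an optional default as its second argument, and the
# as_number filter casts the rendered string to a number.
threads: "{{ env_var('DBT_THREADS', '16') | as_number }}"
```

I’d still prefer the project-variable approach if it can be made to work, since our env vars also go through the controlled deployment process.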
Some example code or error messages
When I run `dbt compile` for one of my models, the output shows the “old” hard-coded value of 16 threads, not my var default of 24:
```
(.data_eng_env) vscode ➜ /workspaces/snowconnect-dbt-gitlab/qsi_dbt (maroon/ANLTS-5483) $ dbt compile --models stage.carelogic_static.CF_FINAL
15:38:02  Running with dbt=1.7.2
15:38:03  Registered adapter: snowflake=1.7.0
15:38:03  Unable to do partial parsing because a project config has changed
15:38:25  Found 1954 models, 3 seeds, 2068 tests, 1949 sources, 0 exposures, 0 metrics, 547 macros, 0 groups, 0 semantic models
15:38:25
15:38:28  Concurrency: 16 threads (target='snowflake-db')
```
Do I need to do something to reload profiles.yml? Is there a way to force dbt to re-read the project variable into profiles.yml? Or am I missing something else here?