The problem I’m having
From my local laptop virtualenv (IntelliJ IDE) I am using the dbt-databricks adapter, with dependencies managed in a pyproject.toml file via Poetry, for example:
poetry lock --no-update
poetry install
The version of Python I am using is 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] on win32
.\dbt\macros\spark__location_clause.sql
I have a custom override macro, spark__location_clause.sql, which hardcodes the specific S3 bucket for the dev/preprod/prod environments/workspaces, and each model's config() defines a relevant location_root, like:
{{
  config(
    materialized='incremental',
    unique_key='id',
    schema='silver_curated',
    location_root='silver/curated/data',
    databricks_compute='Compute1'
  )
}}
which explicitly defines the location_root part for the spark__location_clause.sql macro to use.
This works as expected with dbt-databricks 1.7.16 and dbt-core 1.7.15.
The Jinja templating line in the custom .\dbt\macros\spark__location_clause.sql macro:
location 's3://mydevbucketname/{{ location_root }}{{ cust_schema }}/{{ identifier }}'
is rendered during dbt compile to
location 's3://mydevbucketname/silver/curated/data/myusername/mytablename'
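For reference, the custom override is along these lines (a simplified sketch; the cust_schema lookup shown here is illustrative, and the actual macro has per-environment bucket handling):

```sql
{% macro spark__location_clause() %}
  {#- Simplified sketch of the custom override -#}
  {%- set location_root = config.get('location_root') -%}
  {#- cust_schema lookup is illustrative; the real macro derives it differently -#}
  {%- set cust_schema = '/' ~ target.schema -%}
  {%- set identifier = model['alias'] -%}
  {%- if location_root is not none %}
    location 's3://mydevbucketname/{{ location_root }}{{ cust_schema }}/{{ identifier }}'
  {%- endif %}
{%- endmacro -%}
```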
[tool.poetry.dependencies]
dbt-databricks = "^1.7.10"
dbt-core = "1.7.15"
apache-airflow-providers-databricks = "3.3.0"
dbt --version
Core:
- installed: 1.7.15
- latest: 1.8.3 - Update available!
Your version of dbt-core is out of date!
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
Plugins:
- databricks: 1.7.16 - Update available!
- spark: 1.7.1 - Update available!
At least one plugin is out of date or incompatible with dbt-core.
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
dbt-adapters 1.3.2
apache-airflow-providers-databricks 2.9.6
databricks-sql-connector 3.3.0
However, when I upgrade to the latest dbt-databricks and dbt-core 1.8.x versions, the .\dbt\macros\spark__location_clause.sql macro is no longer called, and table creation errors with an invalid external storage location URI.
I wish to keep up to date with the latest versions of dbt-databricks and dbt-core, and also continue to create external (not managed) Databricks Delta tables.
Install dbt 1.8.x (no changes to my working models or the working macros):
[tool.poetry.dependencies]
dbt-databricks = "^1.8.0"
dbt-core = "^1.7.15"
apache-airflow-providers-databricks = "^2.0.0"
dbt --version
Core:
- installed: 1.8.3
- latest: 1.8.3 - Up to date!
Plugins:
- databricks: 1.8.3 - Up to date!
- spark: 1.8.0 - Up to date!
dbt-adapters 1.2.1
apache-airflow-providers-databricks 2.2.0
databricks-sql-connector 3.1.2
The same templating line is now rendered during dbt compile to
location 'silver/curated/data/mytablename'
It looks like dbt ignored the custom override macro .\dbt\macros\spark__location_clause.sql and ran the default dbt_spark.spark__location_clause macro instead:
{% macro spark__location_clause() %}
  {%- set location_root = config.get('location_root', validator=validation.any[basestring]) -%}
  {%- set identifier = model['alias'] -%}
  {%- if location_root is not none %}
    location '{{ location_root }}/{{ identifier }}'
  {%- endif %}
{%- endmacro -%}
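One possibility (an assumption on my part, based on how dbt's adapter dispatch resolves macros by adapter prefix): if dbt-databricks 1.8.x now ships its own databricks__location_clause macro, dispatch would prefer that over any spark__ override, so renaming the custom macro to use the databricks__ prefix might restore the behaviour:

```sql
{#- Hypothetical rename of the custom override; the databricks__ prefix
    is an assumption about the 1.8.x dispatch order, not confirmed -#}
{% macro databricks__location_clause() %}
  {%- set location_root = config.get('location_root') -%}
  {%- set identifier = model['alias'] -%}
  {%- if location_root is not none %}
    location 's3://mydevbucketname/{{ location_root }}/{{ identifier }}'
  {%- endif %}
{%- endmacro -%}
```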
Error message when using dbt run:
10:15:19.423283 [error] [MainThread]: Runtime Error in model mytablename (models\silver\silver_curated\mytablename.sql)
[RequestId=7ade644b-67ad-4b17-8829-6642f4595c8b ErrorClass=INVALID_PARAMETER_VALUE] GenerateTemporaryPathCredential uri silver/curated/data/mytablename is not a valid URI. Error message: INVALID_PARAMETER_VALUE: Missing cloud file system scheme.
10:15:19.424923 [info ] [MainThread]:
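The error itself just says the rendered path lacks a cloud storage scheme. A quick check with Python's urllib.parse (illustrative only, not part of my project) shows the difference between the 1.7.x and 1.8.x renderings:

```python
from urllib.parse import urlparse

# The 1.7.x rendering includes the s3:// scheme, so it is a valid cloud URI.
good = urlparse("s3://mydevbucketname/silver/curated/data/myusername/mytablename")
print(good.scheme)  # -> s3

# The 1.8.x rendering is a bare path with no scheme, hence
# "INVALID_PARAMETER_VALUE: Missing cloud file system scheme".
bad = urlparse("silver/curated/data/mytablename")
print(bad.scheme)   # -> '' (empty string)
```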