The problem I’m having
I’m trying to create a new table in our dbt/Databricks set-up. I (and others) have done this many times before, but this time we’re getting a weird error crop up.
I’ve written a sql definition for the table, and corresponding entries in schema.yml just as I would for any other table. They’re unexceptional and I’ve checked they don’t contain any hidden characters.
When I fire up dbt to create & populate the table, it fails with this error:
14:09:49 Database Error in model bnz_std_sqlbi01_user_transaction_file (models/bronze/bnz_std_sqlbi01_user_transaction_file.sql)
[PARSE_SYNTAX_ERROR] Syntax error at or near ''. SQLSTATE: 42601 (line 16, pos 6)
== SQL ==
/* {"app": "dbt", "dbt_version": "1.9.4", "dbt_databricks_version": "1.10.1", "databricks_sql_connector_version": "4.0.3", "profile_name": "datalakehouse", "target_name": "databricks_cluster", "node_id": "model.datalakehouse.bnz_std_sqlbi01_user_transaction_file"} */
create or replace table `dev`.`bronze`.`bnz_std_sqlbi01_user_transaction_file`
using delta
as
------^^^
SELECT
TRIM(CAST(transaction_originator AS STRING)) AS transaction_originator,
CAST(ab_number AS BIGINT) AS ab_number,
TRIM(CAST(alpha_name AS STRING)) AS alpha_name,
TRIM(CAST(emcu AS STRING)) AS emcu,
CASE WHEN LOWER(active_user) = 'true' THEN True ELSE False END AS active_user,
TRIM(CAST(location AS STRING)) AS location,
TRIM(CAST(status AS STRING)) AS status,
'sqlbi01' AS dlk_src_name,
_metadata.file_modification_time AS dlk_bnz_insert_timestamp,
CURRENT_TIMESTAMP() AS dlk_bnz_std_insert_timestamp
FROM `prod`.`bronze`.`bnz_raw_sqlbi01_user_transaction_file`
WHERE transaction_originator IS NOT NULL
compiled code at target/run/datalakehouse/models/bronze/bnz_std_sqlbi01_user_transaction_file.sql
14:09:49
What it’s complaining about at line 16 pos 6 (under the “as”) is a U+FEFF /  character that dbt seems to have inserted into its own script just so it can complain about it. How the heck do I remove it?
What I’ve already tried
I’ve checked and double-checked my source files using this tool to check for dodgy characters: View non-printable unicode characters
This is my sql file:
{% set key_columns = ['transaction_originator','dlk_bnz_insert_timestamp'] %}
{{
config(alias='bnz_std_sqlbi01_user_transaction_file',
materialized='table',
unique_key=key_columns,
file_format='delta'
)
}}
SELECT
TRIM(CAST(transaction_originator AS STRING)) AS transaction_originator,
CAST(ab_number AS BIGINT) AS ab_number,
TRIM(CAST(alpha_name AS STRING)) AS alpha_name,
TRIM(CAST(emcu AS STRING)) AS emcu,
CASE WHEN LOWER(active_user) = 'true' THEN True ELSE False END AS active_user,
TRIM(CAST(location AS STRING)) AS location,
TRIM(CAST(status AS STRING)) AS status,
'sqlbi01' AS dlk_src_name,
_metadata.file_modification_time AS dlk_bnz_insert_timestamp,
CURRENT_TIMESTAMP() AS dlk_bnz_std_insert_timestamp
FROM {{ source('bronze', 'bnz_raw_sqlbi01_user_transaction_file') }}
WHERE transaction_originator IS NOT NULL