dbt snapshot ambiguous column reference

The problem I’m having

dbt snapshot fails on subsequent runs as a result of an ambiguous column error in the source code

FYI This is using dbt-databricks. My snapshot block looks like this:

{% snapshot raw_reviews_snapshot %}

{{
    config(
      target_schema='airbnb_poc',
      strategy='check',
      unique_key = 'ID',
      check_cols=['listing_id', 'date', 'comments', 'reviewer_name', 'reviewer_id']
    )
}}

SELECT 
    ID,
    listing_id,
    date,
    reviewer_id,
    reviewer_name,
    comments
FROM {{ source('airbnb_source','raw_reviews') }}
WHERE ID IS NOT NULL

{% endsnapshot %}

Hi Jeremy,

Did you get an answer on this? I am facing the same problem with snapshotting data in dbt

Thanks.

Nick

Hi, at least in my case , this is because I’ve schema evolution enabled. If I don’t this is not a problem.

TLDR:

dbt will insert a dbt_unique_key column at their 2nd run of the model. You can notice this in the dbt.log file.

You will see how at the first run of an snapshot, dbt doesn’t use the dbt_unique_key column. But in the following snapshot runs it does. And due to the schema evolution enabled this causes the issue.

One workaround solution to support schema evolution in snapshot models is to set the following query in the post_hook section “ALTER TABLE {{ this }} DROP COLUMN IF EXISTS dbt_unique_key”