Porting a Redshift incremental model with unique_key to Databricks

The problem I’m having

Hey all. I’m working on porting some Redshift models to run on Databricks (Delta format). The current configuration of the models is:

Redshift:

config:
    materialized: incremental
    unique_key: date

For Databricks, I can use:

config: 
    materialized: incremental
    unique_key: date
    incremental_strategy: merge

However, for some of these models, the unique_key column is not actually unique. That works fine on Redshift, where the adapter performs a delete + insert, but the Databricks adapter performs a merge operation, so I can’t use this configuration for those tables.
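As I understand it, the merge the Databricks adapter generates looks roughly like this (a simplified sketch; the target table and temp view names are placeholders):

    -- Simplified sketch of the generated merge; names are placeholders.
    merge into analytics.my_model as target
    using my_model__dbt_tmp as source
      on target.date = source.date
    -- With a non-unique key, one target row can match several source rows
    -- (which Delta rejects), and a merge never deletes the stale rows for a
    -- matched key, so it can't reproduce delete + insert semantics.
    when matched then update set *
    when not matched then insert *;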

I’m wondering if there’s a way to perform the delete as part of the dbt run, before the merge, or whether I should go down the route of Creating new materializations | dbt Developer Hub.
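For the first option, I’m picturing something like an append strategy plus a pre_hook that clears the incoming keys first. A rough sketch, assuming upstream_model stands in for however the model actually selects its new rows, and assuming is_incremental() behaves inside hooks the way I expect:

    {{ config(
        materialized='incremental',
        incremental_strategy='append',
        pre_hook="
            {% if is_incremental() %}
            -- Hypothetical: mimic Redshift's delete step by removing every
            -- row whose key shows up in the new batch. upstream_model and
            -- the date column are placeholders.
            delete from {{ this }}
            where date in (select distinct date from {{ ref('upstream_model') }})
            {% endif %}
        "
    ) }}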

I appreciate any thoughts. Thank you.

Is there any way to make your unique_key column unique? The documentation is pretty firm that your unique keys need to be unique, so if that’s not the case you’ll be in for a bad time as you’re discovering.

Is this intentional, or is it a hangover from a past modelling error? If the latter, is it possible to do a one-off cleanup outside of dbt (or inside, using something like a row_number and filtering to just the first version)?
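For example, the in-dbt cleanup could look something like this (a sketch; my_model and updated_at are placeholders for your model and whatever column defines the version you want to keep):

    with ranked as (
        select
            *,
            -- Number the versions of each key; updated_at is a placeholder.
            row_number() over (partition by date order by updated_at desc) as rn
        from {{ ref('my_model') }}
    )
    select *
    from ranked
    where rn = 1
    -- (in practice, drop the helper rn column from the final select)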

Thank you for your answer @joellabes.

Is there any way to make your unique_key column unique?

Is this intentional, or is it a hangover from a past modelling error?

I don’t think it’s possible to make it unique. This was intentional.

I’ll try to explain a common use case. Let’s say we have an incremental model and the current state of the table is:

key | value
----|------
  1 | 'a1'
  1 | 'a2'
  2 | 'b1'
  2 | 'b2'

And the newly selected data is:

key | value
----|------
  2 | 'b3'

The expected state of the table after the run should be:

key | value
----|------
  1 | 'a1'
  1 | 'a2'
  2 | 'b3'

I’m wondering if there’s another strategy to do the same.

I checked the docs and it looks like delete+insert is supposed to do this (?)
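Against the example above, my reading is that delete+insert boils down to something like this (my_model and new_batch are placeholders):

    -- Delete every existing row whose key appears in the new batch...
    delete from my_model
    where key in (select distinct key from new_batch);
    -- ...removing (2, 'b1') and (2, 'b2') while leaving key 1 untouched...
    insert into my_model
    select * from new_batch;
    -- ...for a final state of (1, 'a1'), (1, 'a2'), (2, 'b3').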

Even if we could find an alternative strategy for Redshift, the Databricks adapter doesn’t offer an incremental_strategy that deletes all rows matching the incoming keys before inserting the new values.

I’m now exploring https://docs.getdbt.com/guides/advanced/creating-new-materializations