The problem I’m having
Hey all. I’m working to parse some Redshift models to run with Databricks(Delta format). The current configuration of the models is:
Redshift:
config:
materialized: incremental
unique_key: date
For Databricks, I can use:
config:
materialized: incremental
unique_key: date
incremental_strategy: merge
However, for some of these models, the unique_key
column is not unique. It works fine as the Redshift adapter performs a delete + insert. However, the Databricks adapter performs a merge
operation; therefore, I cannot use this configuration for these tables.
I’m wondering if there’s a way to perform the delete as part of the dbt run
, before the merge. Or if I should go with Creating new materializations | dbt Developer Hub
I appreciate any thoughts. Thank you.
Is there any way to make your unique_key
column unique? The documentation is pretty firm that your unique keys need to be unique, so if that’s not the case you’ll be in for a bad time as you’re discovering.
Is this intentional, or is it a hangover from a past modelling error? If the latter, is it possible to do a one-off cleanup outside of dbt (or inside, using something like a row_number and filtering to just the first version)?
Thank you for your answer @joellabes.
Is there any way to make your unique_key
column unique?
Is this intentional, or is it a hangover from a past modelling error?
I don’t think it’s possible to make it unique. This was intentional.
I will try to explain a common use case. Let’s say we have the incremental model and the current state of the table is:
key |
value |
1 |
‘a1’ |
1 |
‘a2’ |
2 |
‘b1’ |
2 |
‘b2’ |
And the new selected data is:
The expected table’s state should be:
key |
value |
1 |
‘a1’ |
1 |
‘a2’ |
2 |
‘b3’ |
I’m wondering if there’s another strategy to do the same.
I checked the docs and it looks like delete+insert
is supposed to do this (?)
Even if we could find an alternative strategy for Redshift, the Databricks adapter doesn’t have an incremental_strategy that deletes all the keys before adding the new values.
I’m now exploring Creating new materializations | dbt Developer Hub