Redshift intermittent error "could not complete because of conflict with concurrent transaction"

The problem I’m seeing

We’re using dbt Cloud with a Redshift Serverless warehouse, and we have been running scheduled dbt jobs with 4 threads for months without issue.

Starting December 19, 2025 (the release date of dbt Core v1.11), we started seeing intermittent dbt job failures with the error:

could not complete because of conflict with concurrent transaction

This always happens on the same model (dim_country), which is among the first 4 models to be processed by our 4-thread job.

We have ruled out external conflicts (with other jobs, processes, or manual queries) - this error occurs when nothing else is using the table. The only queries running in parallel are the models processed by the other threads of the same job.

When reviewing the debug logs, we see that the failed statement is always the following:

drop table if exists "dwh"."marts"."dim_country__dbt_backup" cascade;
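For context, this statement comes from dbt’s table materialization, which builds the new table, renames the old one to a `__dbt_backup` suffix, swaps the new one in, and then drops the backup. A rough sketch of the sequence, using the names from the log above (the exact DDL varies by adapter version, and the transaction boundaries are not shown):

```sql
-- Rough sketch of dbt's table-materialization swap on Redshift;
-- not the adapter's exact DDL.
create table "dwh"."marts"."dim_country__dbt_tmp" as (
    select ...  -- compiled model SQL
);
alter table "dwh"."marts"."dim_country" rename to "dim_country__dbt_backup";
alter table "dwh"."marts"."dim_country__dbt_tmp" rename to "dim_country";
-- The step that intermittently fails for us:
drop table if exists "dwh"."marts"."dim_country__dbt_backup" cascade;
```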

The jobs are always scheduled on the hour (6am, 6pm). We have tried changing to 3am, but the issue persists.

When manually re-running the failed jobs, they always complete successfully.

My question

Can anyone think of the root cause of this issue? Also, is it a coincidence that it started on the release date of the latest dbt version?

Would appreciate any help.

Update January 12, 2026

Quickly updating that once I updated my dbt Cloud settings to use the Compatible dbt version instead of Latest, the issue seems to have stopped.

When using Latest, the logs showed the dbt version in use was 2025.12.20, which I believe maps to dbt 1.11.1 or 1.11.2 - that’s when we saw the issue. When using Compatible, the logs showed version 2025.12.19, which I believe maps to dbt 1.11.0.

Maybe that could be a hint to what may be causing the issue?

Thanks

I’m also encountering this problem. We’re also using dbt Cloud with a Redshift Serverless cluster, and the problem also started appearing on 2025-12-19. Our pipeline has a much larger number of models, and to date it has always been a different model that hits a table locked by some other process. The issue has occurred during scheduled runs, in MR CI/CD pipeline builds (which create an MR-specific schema name for the build), and in manual runs, and we typically “resolve” it by simply re-running the pipeline (without changing any code).

I assumed someone in my org had started running some process that manually backs up tables (for some reason), but given that your issue also appeared the same day, I suspect there might be some other change at play.

Hey @mattt, I managed to resolve this by adding the retry_all: true extended attribute in the dbt Cloud Environment settings. This makes the dbt Redshift adapter retry failed statements.
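For anyone else hitting this: the Extended Attributes field in the dbt Cloud environment settings takes YAML that gets merged into the connection profile. A minimal sketch of what we added (just the one key discussed in this thread):

```yaml
# Merged into the Redshift connection profile via dbt Cloud Extended Attributes.
# retry_all makes the dbt Redshift adapter retry failed statements,
# e.g. the transient "conflict with concurrent transaction" error.
retry_all: true
```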

Another thing that helped, even if temporarily, was to change the dbt version setting from Latest to Compatible on our Production dbt environment.


Thanks @amitt! I adopted the switch from Latest to Compatible yesterday, and while we have only had a few runs so far, we haven’t observed the concurrent transaction failure mode yet.

I’ll look into adding the retry_all: true config if the problem recurs. In any case, thank you so much for posting the problem, your findings, and your fixes!

In summary -
We are no longer experiencing this issue, after taking the following steps:

  1. Added the retry_all: true extended attribute to our production dbt Cloud environment

  2. Contacted AWS in this regard, and were assured that they took steps to mitigate, see their response below:

We have investigated this issue and identified it as an internal catalog conflict. This error occurs when background system maintenance operations run concurrently with your DROP TABLE commands, causing a timing conflict when both attempt to update the same internal system records. Our engineering team is actively investigating the root cause.

To mitigate this issue, we have enabled two optimizations on your cluster. First, we have enabled extended diagnostic logging to capture additional details if the issue recurs, which will help our engineering team further analyze the root cause. Second, we have adjusted an internal timing parameter to reduce the probability of conflicts between DROP TABLE operations and background catalog queries.

A one-time cluster pause/resume is required to activate these configuration changes. Please schedule the same at your earliest convenience during a suitable maintenance window.

In our case this is Redshift Serverless, but I’ve heard from others that AWS applied a similar approach for customers with provisioned Redshift clusters who experienced this issue.

Feel free to refer to our AWS support case (176796275100419) if/when you’re reaching out to AWS.


This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.