Issue with Post Hooks Not Running With DBT 0.16.1

I’ve noticed an issue yesterday and today where my post hooks which create my indexes are no longer running after the end of my model.

post_hook=[
‘CREATE UNIQUE INDEX IF NOT EXISTS UI_{{ this.name }}_pk_id ON {{ this }} (“pk_id”)’
]

Did something change recently where this syntax is no longer working like it did for prior versions?

Hi @chanwd
Sounds like you might be experiencing a bug. To help us understand this issue, please use this format:

Describe the bug

A clear and concise description of what the bug is. What command did you run? What happened?

Steps to reproduce

In as much detail as possible, please provide steps to reproduce the issue. Sample data that triggers the issue, example model code, etc is all very helpful here.

Expected behavior

A clear and concise description of what you expected to happen.

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

System information

Which database are you using dbt with?

  • [ ] postgres
  • [ ] redshift
  • [ ] bigquery
  • [ ] snowflake
  • [ ] other (specify: ____________)

The output of dbt --version:

<output goes here>

The operating system you’re using:

The output of python --version:

Additional context

Add any other context about the problem here.

HI Claire, sorry for the late response. Here is the information you’ve requested:

Describe the bug

If you attempt to recreate a model using --full-refresh, indexes that already existed on the table won’t be recreated as part of the full-refresh. If you add a new index to the model, only that index will be created, the existing indexes will not be recreated.

Steps to reproduce

  1. Create a model that contains post hooks for adding indexes if they don’t exist.

Something like this:

{{
config(
materialized=‘incremental’,
unique_key=‘id’,
post_hook=[
‘CREATE UNIQUE INDEX IF NOT EXISTS UI_{{ this.name }}id ON {{ this }} (“id”)’,
'CREATE INDEX IF NOT EXISTS IDX
{{ this.name }}survey_type_id ON {{ this }} (“survey_type_id”)’,
'CREATE INDEX IF NOT EXISTS IDX
{{ this.name }}_pulse_id ON {{ this }} (“pulse_id”)’
]
)
}}

SELECT
id,
survey_type_id,
pulse_id
FROM {{ ref(‘survey’) }}

  1. Confirm the model was created with the indexes expected. If the model didn’t exist before, it will contain all the indexes. If it’s a model that already exists with those indexes, the existing indexes won’t be recreated as part of the full-refresh.

  2. Add an index to the post hook then run dbt run --model modelName --full-refresh. Note that only the new index you added will be present but the ones that were on the model prior to the full-refresh are not recreated.

Expected behavior

My expectations are that the indexes should always be recreated as part of a full-refresh just like 0.14.0.

Screenshots and log output

No screenshots.

System information

Which database are you using dbt with?

We’re using Aurora Postgres

The output of dbt --version :

We’re using 0.16.1 on our Dev and QA enviornments. I can verify that on Production we’re using 0.14.0 and do not have these issues.

The operating system you’re using:

MAC

The output of python --version :

Python 2.7.10

Additional context

None

Hi @claire , I was wondering if you had an update on this particular bug or needed more information from me?

Thanks,
Will

hey @claire, we’re still dealing with this issue in DBT 0.17.0. Any idea when you guys might have time to look into this issue?

I figured out why this is going wrong on recent versions of dbt. (For local testing I’ve been using v0.18.0.)

Let’s talk about the differences between the two runs, both dbt run -m incremental_model --full-refresh. All code below is based on the reproduction example you provided above.

First run

dbt runs the following SQL statements:

-- step 1: create
create  table "jerco"."dbt_jcohen"."incremental_model"
  as (
    

SELECT
id,
survey_type_id,
pulse_id
FROM "jerco"."dbt_jcohen"."survey"
  );

-- step 2: post hooks
CREATE UNIQUE INDEX IF NOT EXISTS UI_incremental_modelid ON "jerco"."dbt_jcohen"."incremental_model" ("id");
CREATE INDEX IF NOT EXISTS IDXincremental_modelsurvey_type_id ON "jerco"."dbt_jcohen"."incremental_model" ("survey_type_id");
CREATE INDEX IF NOT EXISTS IDXincremental_model_pulse_id ON "jerco"."dbt_jcohen"."incremental_model" ("pulse_id");

Let’s check to make sure those indexes indeed exist:

select * from pg_indexes
where schemaname = 'dbt_jcohen'
  and tablename = 'incremental_model'
schemaname tablename indexname tablespace indexdef
dbt_jcohen incremental_model ui_incremental_modelid CREATE UNIQUE INDEX ui_incremental_modelid ON dbt_jcohen.incremental_model__dbt_backup USING btree (id)
dbt_jcohen incremental_model idxincremental_modelsurvey_type_id CREATE INDEX idxincremental_modelsurvey_type_id ON dbt_jcohen.incremental_model__dbt_backup USING btree (survey_type_id)
dbt_jcohen incremental_model idxincremental_model_pulse_id CREATE INDEX idxincremental_model_pulse_id ON dbt_jcohen.incremental_model__dbt_backup USING btree (pulse_id)

All good!

Second run

This time, dbt_jcohen.incremental_model already exists as a table in the database, so dbt will run a subtly different set of queries.

-- step 1: rename
alter table "jerco"."dbt_jcohen"."incremental_model" rename to "incremental_model__dbt_backup";

-- step 2: create
create  table "jerco"."dbt_jcohen"."incremental_model"
  as (
    

SELECT
id,
survey_type_id,
pulse_id
FROM "jerco"."dbt_jcohen"."survey"
  );

-- step 3: post hooks
CREATE UNIQUE INDEX IF NOT EXISTS UI_incremental_modelid ON "jerco"."dbt_jcohen"."incremental_model" ("id");
CREATE INDEX IF NOT EXISTS IDXincremental_modelsurvey_type_id ON "jerco"."dbt_jcohen"."incremental_model" ("survey_type_id");
CREATE INDEX IF NOT EXISTS IDXincremental_model_pulse_id ON "jerco"."dbt_jcohen"."incremental_model" ("pulse_id");

-- step 4: cleanup
drop table if exists "jerco"."dbt_jcohen"."incremental_model__dbt_backup" cascade;

Now, if we run the same query as before (select * from pg_indexes ...), we don’t get any results.

What happened?

Let’s think through this step by step:

  1. Via alter table..., dbt renames the “old” version of incremental_model to avoid a namespace collision. If we “pause” here and check pg_indexes, we’ll see that the three indices now reference incremental_model__dbt_backup instead of incremental_model.
  2. dbt creates the new version of incremental_model.
  3. dbt runs create index if not exists. Indices by those names do already exist, though they now reference incremental_model__dbt_backup.
  4. dbt cascade-drops incremental_model__dbt_backup, which also drops the indices.

The upshot here is that, on the third run, dbt would recreate the indexes on the right table, since they no longer exist.

Solutions

  • Instead of create index if not exists, you could run two statements: drop index if exists ... + create index. A normal drop index statement can be run inside a transaction (docs), so this is a pretty low-risk approach.
  • You could give each index a name that will be unique for each run/invocation of dbt, i.e. by including a dbt-Jinja context variable such as {{invocation_id}} or {{run_started_at}} in the index name construction. That way, the post-hooks should not find a same-named index already existing from a previous run.

I’m not sure why this “worked” on dbt v0.14.0. In any case, either of the approaches above should enable you to run this on a more recent version.

FYI

In the future, the best places for these bug reports are:

  • stackoverflow: ask a question, tag it dbt
  • dbt repo: open an issue if you believe the cause to be error in the dbt codebase

If you put it in one of those places, you’re likelier to get an answer much sooner :slight_smile:

@jerco

Thanks for taking the time to respond! I’ll make the appropriate changes!