Drop columns in a dbt model

Hi there,

I have a macro that will add some extra columns to an existing model and performs some data cleansing as per business requirements.

After the data been cleansed, what is the best way to delete the unwanted columns?

Suppose an input data(model) may have millions of columns and we don’t care what are the other columns other than the known columns that are to be deleted.

Below code in a dbt model can select the columns in the list but I don’t know how to delete/not select the columns in the list. Any thoughts?

{% set columns_to_delete = ["column_1", "column_2", "column_3"] -%}

select 
    {% for column in columns_to_delete %}
        {{ column }},
    {% endfor %} 

I have checked that {% for column **not** in columns_to_delete %} is not working.

Thanks in advance :slight_smile:

hey @s.chaser!
I am unsure if by I don't know how to delete/not select the columns in the list you’re wanting to exclude them from a query or literally drop the specific columns from a table’s ddl.

If you’re just wanting to exclude a set of columns, you can utilize dbt_utils.star like this:

select
{{ dbt_utils.star(from=ref('my_model'), except=["exclude_field_1", "exclude_field_2"]) }}
from {{ ref('my_model') }}

If you need to specifically drop them from a table (assuming you’re using a snowflake warehouse), I imagine you could create a macro like this:

{% macro drop_cols(schema, table, col_list = [], debug = true) %}

{% set sql %}
  alter table {{target.database}}.{{schema}}.{{ table }} drop 
  {% for col in col_list %}
    {{col}}
    {% if not loop.last %},{% endif %}
  {% endfor %}
{% endset %}

{% log(sql, true) %}
{% if not debug %}
  {{ run_query(sql) }}
{% endif %}
{% endmacro %}
1 Like

Hey @Reidwil Thanks for your help. It looks perfectly what I wanted to achieve. I will give it a go when have time.
Merry Xmas!

@Reidwil
Great to point me to macro dbt_utils.star

@Reidwil

Tested below basic macro:

  {#
  -- debug = true, 
  #}

{% macro drop_cols(
  schema,
  model_name,
  col_list = []
) %}

{% set sql %}
  alter table {{target.database}}.{{schema}}.{{ model_name }} drop 
  {% for col in col_list %}
    {{col}}
    {% if not loop.last %},{% endif %}
  {% endfor %}
{% endset %}

{#
-- {% log(sql, true) %}
-- {% if not debug %}
#}
{%- if execute -%}
  {{ run_query(sql) }}
{% endif %}
{% endmacro %}

when called the macro,

select *
from ({{ drop_cols( 
    schema = 'xxx',  
    model_name = 'DATA_DROP_COLUMNS',
    col_list = ['COL2','COL3','COL4']
    ) 
}})

The result showing that 3 columns already dropped as expected from DATA_DROP_COLUMNS,

with below error messages:


Completed with 1 error and 0 warnings:

Database Error in model drop_columns (models\drop_columns\drop_columns.sql)
  001003 (42000): SQL compilation error:
  syntax error line 7 at position 0 unexpected '|'.
  syntax error line 13 at position 6 unexpected ')'.

Need spent more time to figure out what’s going on here :slight_smile:

Thanks again for the great help :grinning_face_with_smiling_eyes: :laughing:

the compiled message is showing as below:

select *
from (



| column | data_type |
| ------ | --------- |
| status | Text      |


)

Happy Holidays! In this situation, the macro was written to be run as a stand alone run-operation like so:
dbt run-operation drop_cols --args '{schema: xxx, model_name: DATA_DROP_COLS, col_list: ['COL2', 'COL3', 'COL4']}'.

Hope that helps!

Why not use a simple post_hook?

{{
config(
post_hook=“ALTER TABLE some_table DELETE COLUMN col1, col2, col3”
)
}}

Should work for snowflake!

1 Like

Hi Reidwil, Happy new year!
It definitely helps me to understand the case. Thank you heaps for the time.

Happy new year @datachef !

It is a good direction to try out, but unfortunately I had no luck with my limited knowledge using post_hook in dbt models.

I have tried different variations but all failed with different error messages.

{{
config(
post_hook="ALTER TABLE DATA_DROP_COLUMNS DELETE COLUMN col1, col2, col3"
)
}}

select *
from {{ ref('data_drop_columns')}}

Here is the error message:

Completed with 1 error and 0 warnings:

Database Error in model drop_columns_hook (models\drop_columns\drop_columns_hook.sql)
  001003 (42000): SQL compilation error:
  syntax error line 1 at position 30 unexpected 'DELETE'.

Are you able to point out what I have done wrong here?

Thanks again

Check out the drop column syntax here!

Thanks for the link @Reidwil — I have read multiple times of the linked info, still have no clue how to fix it with my limited understanding.
I don’t see any syntax error here :sweat_smile:

The example syntax to dropping a column in the above link is:

alter table t1 drop column a2;

whereas your post hook statement is:

alter table t1 delete column a2;

Delete isn’t the correct keyword. Drop is the correct keyword. (hence the error message around unexpected 'DELETE' error you’re getting)

ohh my god, I am sweating now…I was blinded!
Thank you @Reidwil for being patient with me :slight_smile:

It is now working as expected!


select * 
from {{ ref('data_drop_columns')}}
{{
config(
post_hook="ALTER TABLE SCHEMA_NAME.THIS_MODEL_NAME DROP COLUMN col2, col3, col4"
)
}}

Any chance here to improve from changing SCHEMA_NAME to whatever schema the current model being created in? This because the current model is created in a different schema defined by dbt_project.yml rather than settings in the profiles.yml.

I guess there should be smarter way to do so, just have no luck to find anything to help me to do so.

Thank you again for being so helpful to direct me on the right thing to do.

I believe you can use {{ target.schema }}

1 Like

@Reidwil This is very helpful.

Now it works as a gem :slight_smile:


select * 
from {{ ref('data_drop_columns')}}
{{
config(
post_hook="ALTER TABLE {{ THIS.SCHEMA }}.THIS_MODEL_NAME DROP COLUMN col2, col3, col4"
)
}}

This will ensure the columns in the dbt model are dropped as required no matter what is the custom schema.

This is a great learning process. It helps me to understand how it work here!

Thanks for both of you @datachef and @Reidwil :grinning_face_with_smiling_eyes: :smiley: