Inserting data into Table A with sequential transformations in dbt

Ajinkya · August 2, 2023, 9:17am

Hi all,
I’m facing a unique challenge with data insertion and transformation using dbt. Here’s the situation:

Step 1: I have raw data that requires some transformations, and the result needs to be inserted into Table A.
Step 2: Once Step 1 is complete, I need to perform additional transformations using the data from Table A and then insert the output of Step 2 back into Table A.

I’m seeking some guidance on the best approach to handle this scenario in dbt. If any of you have encountered a similar situation or have insights to share, I’d greatly appreciate your input. Thanks for your time!

dlmolloy97 · August 2, 2023, 9:57am

Hello, Ajinkya.

The way that I would approach this is using common table expressions (CTEs) within a single dbt model. Pseudocode is included beneath.

Please note that I am assuming here that when you say that you want to ‘insert’ data into Table A, this would be in the form of new columns, so that the user can see which data has or has not been transformed.

with original as  (SELECT * from {{ref(source)}},

first_transformation as (SELECT 
# This preserves the original data from Table A
*,
avg(sales) over (partition by region order by 1) as sales_regional_average
# Further transformed columns here
from original),

second_transformation as (select *,
# Here, the transformed data is further altered. I would usually do this all in one line, but separating it out here for example.
round(sales_regional_average, 1) as sales_regional_average_rounded)

select * from second_transformation

I also assume that you do not need to have first_transformation as its own model. If you do, split the steps above across different .sql files/models.

Ajinkya · August 2, 2023, 10:29am

Thank you for the suggestion,
I would like to mention that ‘insert’ data into Table A, this would be in the form of new rows not in the form of columns.

below is the detail scenario:

Table A is responsible for maintaining historical data, currently consisting of approximately 750 million records.

Step 1: In this step, we perform an insertion of 100 new rows into TABLE A.

Step 2: After completing Step 1, we utilize the updated TABLE A, which now includes the 100 newly inserted records from Step 1. Using this updated data, we recalculate and identify any missing data (gaps) within TABLE A. Subsequently, we insert the recalculated missing data into TABLE A.

In summary, the process involves two steps: first inserting 100 new records, and then using these new records to recalculate and fill in any gaps in the historical data within TABLE A.

suryanarayana.v.raju · January 30, 2025, 11:59am

I have a similar use case i have table A upon which i need to run a logic/code and insert the result back to table A and then run the logic again on table A to get the result and insert back to table A and do the same until my logic is no longer giving any records to insert. If your problem is unlike mine and has only couple of steps of transformations i assume you could use staging models to accomplish your task.

Topic		Replies	Views
How to insert data in existing tables using dbt Archive	1	11443	July 30, 2021
getting database error while appending data from one table to another table Help	18	1321	July 6, 2023
Mutiple DBT models for one table Archive	3	9650	April 14, 2022
Issues with dbt Sequential insert - Does it follow order of queries? Help incremental	0	896	March 3, 2023
How to Perform Insert only in DBT Help bigquery , dbt-core	5	1078	August 21, 2024

Inserting data into Table A with sequential transformations in dbt

Related topics