Internal working of incremental materialisation

Hi, in incremental materialisation, what strategy does it follow to backfill a new column?
Config: file format = orc, on_schema_change = append_new_columns
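
Roughly, my model config looks like this (model and column names are placeholders):

```sql
-- my_model.sql (names are placeholders)
{{
    config(
        materialized='incremental',
        file_format='orc',
        on_schema_change='append_new_columns',
        unique_key='id'
    )
}}

select
    id,
    existing_column,
    new_column        -- the newly added column I expected to be backfilled
from {{ ref('upstream_model') }}

{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```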

I am using it, and it is filling nulls.
My understanding is: it doesn't add anything to the old data files. It just adds the new column in the schema metadata for the older files and keeps the row data as it is.
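
In other words, I believe it is effectively a metadata-only change like this (hypothetical Spark SQL on an ORC table):

```sql
-- Metadata-only schema change: existing ORC data files are not rewritten,
-- so the new column reads back as NULL for rows written before the change.
ALTER TABLE my_db.my_table ADD COLUMNS (new_column STRING);
```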

But I have some doubts about it. If this is the case with the ORC format, does it show different behaviour with other formats?

If there is another reason why it's not backfilled, or if there is a way to achieve the backfill without a full refresh, please share.

We use it with BigQuery, and if you are using incremental loads with the append_new_columns config, then the new columns will be null for previously populated rows (pretty much what the documentation says). If you want to populate the new column's values for the previously generated rows, you need to do a full refresh (run with the "-f" or "--full-refresh" flag).
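
For example (model name is a placeholder):

```sh
dbt run --select my_incremental_model --full-refresh
```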

Hi, yes, I agree (as mentioned in the original post). But why does it show this behaviour?

Plagiarising this post by Tristan, since it's a bit outdated now that dbt supports adding columns to incremental tables.
Basically, from my understanding, on every dbt run for an incremental model:

  1. Add the new column(s) to the destination table if the schema has changed (when on_schema_change is set to append_new_columns)
  2. Generate a temp table with the contents of your model's select statement
  3. Delete rows from your existing table whose unique keys match keys that exist in your temp table
  4. Insert all records from your temp table into the target table
  5. Drop the temp table

Now, as you can observe, it only generates data for the new column(s) for the latest batch of records; as a result, the new columns remain null for the rest of the rows. A rough sketch of the SQL this corresponds to is shown below.
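
This is a minimal sketch, assuming a delete+insert incremental strategy and hypothetical table/column names (the exact DDL varies by adapter and warehouse):

```sql
-- 1. New column added to the target (metadata only, so existing rows read as NULL)
alter table analytics.my_model add columns (new_column string);

-- 2. Temp table built from the model's select (only the new/changed rows)
create table my_model__dbt_tmp as
select id, existing_column, new_column
from analytics.upstream_table
where updated_at > (select max(updated_at) from analytics.my_model);

-- 3. Rows whose unique key appears in the temp table are deleted from the target
delete from analytics.my_model
where id in (select id from my_model__dbt_tmp);

-- 4. Everything in the temp table is inserted into the target
insert into analytics.my_model
select id, existing_column, new_column from my_model__dbt_tmp;

-- 5. Temp table dropped
drop table my_model__dbt_tmp;
```

Only the rows inserted in step 4 get a value for new_column; every row that was already in the table keeps the NULL introduced in step 1, which is why a full refresh is the only way to backfill them.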