I am trying to integrate DBT in a workflow where I have a set of table in big query being continuously updated by a 3rd party tool (insert or update only).
I know that the provider is supposed to guarantee the consistency of the table with respect to some statements, ie. the set of table is always updated with the transaction feature.
Know, I am wondering if I can regularly extract a snapshot of all the tables throughout the use of multi statement transactions in bigquery:
To me it seems that for a specific set of model (in my case the snapshot models), I could potentially configure pre/post hook in dbt project like in here: pre-hook & post-hook | dbt Developer Hub to use that feature for my usecase.
In my understanding it should be possible to have the pre-hook (ie begin transaction) running before the first model of the directory is executed, and the post hook (end) is ran after the last model of the directory.
My questions are now:
1: Is my understanding correct, a pre-post hook can be ran only once before and after a set of models ?
2: See the following comment from dbt hook documentation:
To achieve this, you can use one of the following syntaxes. (Note: You should NOT use this syntax if using a database where dbt does not use transactions by default, including Snowflake, BigQuery, and Spark/Databricks.)
BEGIN BEGIN TRANSACTION; INSERT INTO mydataset.NewArrivals VALUES ('top load washer', 100, 'warehouse #1'); -- Trigger an error. SELECT 1/0; COMMIT TRANSACTION; EXCEPTION WHEN ERROR THEN -- Roll back the transaction inside the exception handler. SELECT @@error.message; ROLLBACK TRANSACTION; END;
Why one couldn’t use this syntax with bigquery as bigquery is supposed to support it ?
Thank you in advance for your help