I have a dbt model materialized as a table (Snowflake). The model has a complex query with multiple CTEs and multiple JOINs. The Snowflake query planner makes very poor decisions about join order and filtering, leading to really long execution times.
By splitting this model into a few models, each materialized as a table and referenced from the original model, I force Snowflake to JOIN/filter things in the "right order". This cuts the execution time from 40 minutes to 40 seconds.
So this works, but I don't want to leave those "support tables" lying around.
I wonder if there is any way to do either of the following:
- make dbt create these models with `CREATE TEMPORARY TABLE`, so that the tables are automatically dropped by Snowflake when the session is over.
- make dbt execute an explicit `DROP TABLE IF EXISTS` for those tables (is a post-hook the only way?)
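For reference, the post-hook version I have in mind would look something like this (model and table names are made up). Note the hook would have to go on the downstream model, since a post-hook on the support model itself would drop the table before anything can read it:

```sql
-- models/final_model.sql (hypothetical names)
{{ config(
    materialized='table',
    post_hook="drop table if exists {{ ref('support_model') }}"
) }}

select *
from {{ ref('support_model') }}
```

One catch: if several final models read the same support table, whichever finishes first would drop it out from under the others, so the hook would have to live on whichever model is guaranteed to run last.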
I've tried ephemeral models, but that is not the same: since an ephemeral model is injected as a CTE, I'm still at the mercy of the query planner. I also want to reuse these intermediate results in more than one table model, and a CTE would most likely mean recomputing them, depending on caching.
I've also tried an on-run-end hook, and I think that works, but I don't want to run the DROP TABLE when I'm not building that particular model. I just wonder if there is a better alternative.
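For what it's worth, the conditional version of the on-run-end idea is roughly this, using the `results` variable that dbt exposes to on-run-end hooks so the DROP only fires when the support model was actually built in that invocation (macro and model names are made up):

```sql
-- macros/drop_support_tables.sql (hypothetical macro)
{% macro drop_support_tables(results) %}
  {% for res in results %}
    {# only drop when the support model was built successfully in this run #}
    {% if res.node.name == 'support_model' and res.status == 'success' %}
      {% do run_query('drop table if exists ' ~ res.node.schema ~ '.' ~ res.node.name) %}
    {% endif %}
  {% endfor %}
{% endmacro %}
```

wired up in `dbt_project.yml` with `on-run-end: "{{ drop_support_tables(results) }}"`. That avoids dropping a table I didn't rebuild, but it still feels like a workaround.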