dbt Fusion and `static_analysis`

pempey · June 18, 2025, 3:57pm

I have been having a discussion about the static_analysis config in the dbt Fusion engine on the #dbt-fusion-engine Slack channel but I want to bring the discussion to a wider audience. I will link to the thread but it contains a few different topics so I will summarize the discussion so far. It should be noted that my viewpoint is from a dbt Core only project and is not using any of the additional features of the dbt Platform.

Slack Thread

Current State

The dbt Fusion engine attempts to compile and validate the SQL in a project before any SQL is executed, static_analysis:on. If a model contains an introspective query, meaning it needs to get information about another object via a SQL query, or the schema of the source model is not known then the Fusion engine will, as a default, skip the compile and validation of that model, and all downstream objects, until it is run where it is compiled and executed but not validated, static_analysis:off. The default can be overridden by setting static_analysis config to unsafe witch changes the behavior to delay the compile and SQL check to happen just before a model is executed instead of before any SQL is executed.

This default behavior is designed to ensure that if the SQL is reported an valid that the validation will remain true even if the run is separated in time from the compile. And, if the compilation of a model is independent of any other model or SQL queries then the compilation can be parallelized to increase performance of the compile process.

Concerns with the Current State

The fist concern I have with the current state is that is assumes that the database schema is volatile and changes in an unknowable fashion. In our production environment only dbt is changing the database schema and is orchestrated is a way that minimizes any overlap. And in our development environment our database are isolated at the developer level and can only be modified by the developer, typically using dbt. So automatically marking models with introspective queries as static_analysis:off makes an evaluation of risk for me with out understanding the environment.

The second concern I have is with the language, unsafe, used as the default override configuration. Introspective queries have been a regular, if not promoted, part of dbt model development for a long time and suddenly telling developers that they are doing something ‘wrong’ by using them feels heavy handed and contrary to previous assumptions. Common utilities like date_spine and get_column_values fall in to the unsafe category.

My third concern is that automatically tunning off the validation of SQL has an outsized impact on the development process where it would be needed the most and the risk of the run SQL not matching the validated SQL is the least. And the fact that the validation to tunned off silently for all models down stream of a model with an introspective query can be confusion when a SQL error is reported from the database and the developer assumed that it should have been checked before a query was sent.

Proposal

I propose that the language be changed to better describe the functionally of the static_analysis feature and move away from the ‘correctness’ of the behavior. The labels auto always and never could replace the the current labels of on unsafe and off without changing the functionally but more clearly communicating what would happen to a model with the config.

I do not personally like the default for static_analysis to be auto because it is may change the behavior of the models in a way that is not obvious to the developer, but I can also see it as a good starting point in some projects. In our project I am reasonably sure that we will set it up to always at the project level and then manage exceptions to that on a model by model basis.

Questions for the Community

I’d love to hear from others in the community about their experiences with the Fusion engine’s static_analysis behavior:

How would the current static_analysis configuration affected your workflows?
Do you find the unsafe terminology problematic or confusing?
What would be your ideal behavior for handling introspective queries in the compilation process?
Are there specific use cases where the current behavior would caused issues or benefits?

Looking forward to the discussion and hopefully finding a path forward that better serves the diverse needs of the dbt community.

gclunies · June 25, 2025, 8:32pm

Cross posting my 2 cents from (Slack) for visibility.

I think a big issue here is:

Traditionally dbt evangelized how jinja, macros, and dbt packages (like dbt_utils ) were great tools to save developers time by accomplishing a lot with few characters. Even if the complexity went up and readability went down, it was a worthwhile trade.

dbtf has the same intention of saving developers time, but it is in complete contrast with the established and encouraged dbt patterns of jinja, macros, and packages should be embrace. dbtf seems to heavily discourage their use.

FWIW - I much prefer the labels auto always and never that @pempey suggests.

joellabes · July 1, 2025, 6:03am

Great points here, thank you both - I have been chewing on this for a couple of weeks but figured I should get some quick notes down!

A couple of high level points:

I don’t think we’ve done a good enough job of explaining the difference between Just In Time compilation^[1] vs Ahead Of Time compilation^[2] and how they interact with static analysis^[3]. I have a note to self to flesh this out. In both JIT and AOT compilation, the models are still statically analysed (as long as it hasn’t been turned off) - it’s just a function of when that validation happens.
I’m somewhat sympathetic to the argument that unsafe/always mode should be the default. But given that it only takes one line of config in your dbt_project.yml to change the default, I don’t have super strong feelings about this.
I’m extremely sympathetic to the idea that there are common macros which currently require introspection, and that it feels icky. My thesis is that a not-insignificant number of utils-style queries may not need to be introspective in the long term. To address a couple:
- dbt.date_spine has no business needing an introspective query; that’s just a hangover from the late 2010s. We’re gonna rustify that so it’s static
- dbt_utils.star is redundant for most warehouses at this point - we should consider returning native select * except(...) where possible. Even when the target warehouse doesn’t support it, the dbt Fusion engine now knows the column names and types of models after its initial caching step, so it should be able to be done statically.

Quoting Alex from an internal thread:

if you do a lot of introspection, your SQL is not guaranteed to be idempotent upon execution especially if the introspection happens across dependencies (i.e. not upstream) or against external tables / sources. Turning static analysis off is our way of saying, “we can’t guarantee this,” and marking it as unsafe is your way of saying “I know, give it to me anyway”.

I don’t know whether giving a dangerous config a less-scary name does anything useful - it is kinda a feature that you get prompted to think “do I really need this”? For example, last year I worked on a model which did a ton of pivoting. I used get_column_values liberally, but looking back, I actually could have defined them in a Jinja dict and it would have been clearer and more resilient (I had a downstream Looker model which depended on those exact columns existing, and no others).

Appreciate you both taking the time to write this up and keen to hear what others think!

dbt Core’s only option, how the Fusion engine handles introspective queries ↩︎
The Fusion engine’s default behaviour ↩︎
Checking that the rendered code is syntactically valid ↩︎

joellabes · July 16, 2025, 10:50pm

I’m becoming progressively more sympathetic to changing the default:

while writing updated docs, I couldn’t come up with non-contrived examples of when unsafe isn’t the good default option. The biggest one is that a CI job which just runs dbt compile --defer as its first step to check if there are any egregious errors - if it was a dbt run then rendering and analysis would wait until the model had been built, but a standalone compile will do a best-effort compile based on the production table which might result in nonexistent columns.
Changing all models to unsafe static analysis also switches all models to JIT rendering, which can add a significant performance penalty. Better to just have it surgically applied where it’s needed

Watch this space

pempey · July 17, 2025, 12:31pm

The JIT rendering is what we are all used to with dbt so how can you call that a penalty? Also it is a presumption that the potential performance gains from AOT rendering are ‘better’ than the overall potential performance gains from always pre-checking the SQL. Also it assumes that every developer will know when to ‘surgically apply’ the setting and fully understand the impacts when doing it when once config on a model can affect an unknown number of models downstream.

joellabes · July 18, 2025, 7:48am

Yep we are in complete agreement! When I said “surgically apply” I was meaning that Fusion itself is best at marking only the necessary nodes as unsafe. In contrast to either asking end users to do it to specific nodes, or just saying "set +static_analysis: unsafe in dbt_project.yml across the board.

All this to say that there’s a PR up at the moment which will automatically mark models for unsafe static analysis when it encounters problematic introspection instead of turning it off altogether.

I also have some new docs drafted which I’m feeling very pleased with and will be eager to see how they land in terms of explaining all this.

pempey · July 18, 2025, 1:28pm

@joellabes Ahh, I see I did not interpret your statement correctly. I look forward to the new docs and the new settings.

Topic		Replies	Views
DBT Linter that parses Jinja instead of compiled SQL Archive	1	3498	July 19, 2022
Writing queries in a BI tool vs modeling them in dbt In-Depth Discussions best-practice , bi	4	4840	June 21, 2018
Run dbt without compiling Archive	2	9022	January 12, 2022
Setting up VSCode to use with the dbt CLI Show and Tell tools , cli , vs-code	12	87280	December 11, 2023
State of testing in dbt In-Depth Discussions testing	5	13876	January 20, 2021

dbt Fusion and `static_analysis`

Current State

Concerns with the Current State

Proposal

Questions for the Community

Related topics