New concepts beta

info

The dbt Fusion engine is currently in beta and the related documentation is a work in progress. The information on this page will evolve as features are added and enhanced. Join the conversation in our Community Slack channel #dbt-fusion-engine.

Read the Fusion Diaries for the latest updates.

The dbt Fusion engine fully comprehends your project's SQL, enabling advanced capabilities like dialect-aware validation and precise column-level lineage.

It can do this because its compilation step is more comprehensive than that of the dbt Core engine. When dbt Core referred to compilation, it only meant rendering — converting Jinja-templated strings into a SQL query to send to a database.

The dbt Fusion engine also renders Jinja, but it then completes a second phase: producing a logical plan for every rendered query in the project and validating that plan with static analysis. This static analysis step is the cornerstone of Fusion's new capabilities.

Step                                          dbt Core engine   dbt Fusion engine
Render Jinja into SQL                         Yes               Yes
Produce and statically analyze logical plan   No                Yes
Run rendered SQL                              Yes               Yes

Rendering strategies

Each dot represents a step in that model's execution (render, analyze, run). The numbers reflect step order across the DAG. JIT steps are green; AOT steps are purple.
 JIT rendering and execution (dbt Core)

dbt Core will always use Just In Time (JIT) rendering. It renders a model, runs it in the warehouse, then moves on to the next model.

 AOT rendering, analysis and execution (dbt Fusion engine)

The dbt Fusion Engine will default to Ahead of Time (AOT) rendering and analysis. It renders all models in the project, then produces and statically analyzes every model's logical plan, and only then will it start running models in the warehouse.

By rendering and analyzing all models ahead of time, and only beginning execution once everything is proven to be valid, the dbt Fusion Engine avoids consuming any warehouse resources unnecessarily. By contrast, SQL errors in models run by dbt Core's engine will only be flagged by the database itself during execution.
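The contrast between the two strategies can be sketched in a few lines of Python. This is an illustrative model only, not how either engine is implemented: the function names are hypothetical, the DAG is a simple linear chain, and the sketch assumes static analysis catches every broken model, which the real engine does not promise for all SQL.

```python
from typing import List, Set, Tuple

Step = Tuple[str, str]  # (phase, model name)


def jit_steps(models: List[str]) -> List[Step]:
    """dbt Core style: render a model, run it, then move to the next one."""
    steps: List[Step] = []
    for model in models:
        steps.append(("render", model))
        steps.append(("run", model))
    return steps


def aot_steps(models: List[str]) -> List[Step]:
    """Fusion default: render everything, analyze everything, then run."""
    return (
        [("render", m) for m in models]
        + [("analyze", m) for m in models]
        + [("run", m) for m in models]
    )


def models_run_before_error(models: List[str], broken: Set[str], aot: bool) -> List[str]:
    """Return the models that reach the warehouse before a SQL error surfaces."""
    if aot:
        # Assumption: analysis flags every broken model before anything executes.
        return [] if any(m in broken for m in models) else list(models)
    ran: List[str] = []
    for model in models:
        if model in broken:
            break  # the database itself rejects the query at execution time
        ran.append(model)
    return ran


dag = ["model_a", "model_b", "model_c"]
print(jit_steps(dag))
print(aot_steps(dag))
print(models_run_before_error(dag, {"model_c"}, aot=False))
print(models_run_before_error(dag, {"model_c"}, aot=True))
```

With a broken model_c, the JIT path has already run model_a and model_b in the warehouse by the time the error appears, while the AOT path runs nothing.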

Rendering introspective queries

The exception to AOT rendering is an introspective model: a model whose rendered SQL depends on the results of a database query. Models containing macros like run_query() or dbt_utils.get_column_values() are introspective. Introspection causes issues with ahead-of-time rendering because:

  • Most introspective queries are run against the results of an earlier model in the DAG, which may not yet exist in the database during AOT rendering.
  • Even if the model does exist in the database, it might be out of date until after the model has been refreshed.

The dbt Fusion Engine switches to JIT rendering for introspective models, to ensure it renders them the same way as dbt Core.

Note that macros like adapter.get_columns_in_relation() and dbt_utils.star() can be rendered and analyzed ahead of time, as long as the Relations they inspect aren't themselves dynamic. This is because the dbt Fusion Engine populates schemas into memory as part of the compilation process.

Principles of static analysis

Static analysis is meant to guarantee that if a model compiles without error in development, it will also run without compilation errors when deployed. Introspective queries can break this promise by making it possible to modify the rendered query after a model is committed to source control.

The dbt Fusion Engine is unique in that it can statically analyze not just a single model in isolation, but every query from one end of your DAG to the other. Even your database can only validate the query in front of it! Concepts like information flow theory — although not incorporated into the dbt platform yet — rely on stable inputs and the ability to trace columns DAG-wide.

Static analysis and introspective queries

When Fusion encounters an introspective query, that model switches to just-in-time rendering (as described above). The introspective model and all of its descendants are also opted in to JIT static analysis. We refer to JIT static analysis as "unsafe": it still captures most SQL errors and prevents execution of an invalid model, but only after upstream models have already been materialized.

This classification is meant to indicate that Fusion can no longer 100% guarantee alignment between what it analyzes and what will be executed. The most common real-world example where unsafe static analysis can cause an issue is a standalone dbt compile step (as opposed to the compilation that happens as part of a dbt run).

During a dbt run, JIT rendering ensures the downstream model's code will be up to date with the current warehouse state, but a standalone compile does not refresh the upstream model. In this scenario Fusion will read from the upstream model as it was last run. This is usually harmless, but if the upstream data has changed since that run, errors could be raised incorrectly (a false positive) or missed entirely (a false negative).

 Rendering and analyzing without execution

Note that model_d is rendered AOT, since it doesn't use introspection, but it still has to wait for introspective_model_c to be analyzed.

You will still derive significant benefits from "unsafe" static analysis compared to no static analysis, and we recommend leaving it on unless you notice it causing you problems. Better still, you should consider whether your introspective code could be rewritten in a way that is eligible for AOT rendering and static analysis.

Recapping the differences between engines

dbt Core:

  • renders all models just-in-time
  • never runs static analysis

The dbt Fusion engine:

  • renders all models ahead-of-time, unless they use introspective queries
  • statically analyzes all models ahead-of-time by default; if a model or any of its parents was rendered just-in-time, that model's static analysis also happens just-in-time
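The Fusion rules above can be expressed as a small classification pass over the DAG. The following is a minimal sketch under stated assumptions: the function name, the dictionary shape, and the requirement that models arrive in topological order are all hypothetical choices for illustration, not part of Fusion's interface.

```python
from typing import Dict, List, Set


def classify(
    order: List[str],
    parents: Dict[str, List[str]],
    introspective: Set[str],
) -> Dict[str, Dict[str, str]]:
    """Assign AOT or JIT to each model's render and analyze steps.

    `order` must list every model after all of its parents.
    """
    result: Dict[str, Dict[str, str]] = {}
    for model in order:
        # Only models that use introspection are rendered just-in-time.
        render = "JIT" if model in introspective else "AOT"
        # JIT analysis is inherited from any parent, so it reaches all descendants.
        inherited = any(result[p]["analyze"] == "JIT" for p in parents.get(model, []))
        analyze = "JIT" if render == "JIT" or inherited else "AOT"
        result[model] = {"render": render, "analyze": analyze}
    return result


order = ["model_a", "model_b", "model_c", "model_d"]
parents = {"model_b": ["model_a"], "model_c": ["model_b"], "model_d": ["model_c"]}
for name, plan in classify(order, parents, {"model_c"}).items():
    print(name, plan)
```

In this chain, model_d keeps AOT rendering because it contains no introspection, but its analysis becomes JIT because it sits downstream of model_c.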

Configuring static_analysis

Beyond the default behavior described above, you can always modify the way static analysis is applied for specific models in your project. Remember that a model is only eligible for static analysis if all of its parents are also eligible.

The static_analysis options are:

  • on: Statically analyze SQL. The default for non-introspective models. Depends on AOT rendering.
  • unsafe: Statically analyze SQL. The default for introspective models. Always uses JIT rendering.
  • off: Skip SQL analysis on this model and its descendants.

When you disable static analysis, features of the VS Code extension which depend on SQL comprehension will be unavailable.

The best place to configure static_analysis is as a config on an individual model or group of models. As a debugging aid, you can also use the --static-analysis off or --static-analysis unsafe CLI flags to override all model-level configuration. Refer to CLI options and Configurations and properties to learn more about configs.
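The precedence described here (CLI flag over model-level config, with a default that depends on introspection) can be sketched as a small resolver. This is a hypothetical helper, not a Fusion API: the function and parameter names are invented, the CLI values are limited to the two this page mentions, and inheritance from parent models is deliberately left out.

```python
from typing import Optional

MODEL_VALUES = {"on", "unsafe", "off"}
CLI_VALUES = {"unsafe", "off"}  # only the flag values mentioned on this page


def effective_static_analysis(
    is_introspective: bool,
    model_config: Optional[str] = None,
    cli_flag: Optional[str] = None,
) -> str:
    """Resolve one model's static_analysis setting (parent inheritance not modeled)."""
    if cli_flag is not None and cli_flag not in CLI_VALUES:
        raise ValueError(f"unsupported CLI value: {cli_flag!r}")
    if model_config is not None and model_config not in MODEL_VALUES:
        raise ValueError(f"unsupported config value: {model_config!r}")
    if cli_flag is not None:
        return cli_flag  # the CLI flag overrides all model-level configuration
    if model_config is not None:
        return model_config
    # Defaults: "unsafe" for introspective models, "on" for everything else.
    return "unsafe" if is_introspective else "on"


print(effective_static_analysis(is_introspective=False))
print(effective_static_analysis(is_introspective=True))
print(effective_static_analysis(is_introspective=False, model_config="on", cli_flag="off"))
```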

Example configurations

Disable static analysis for all models in a package:

dbt_project.yml
name: jaffle_shop

models:
  jaffle_shop:
    marts:
      +materialized: table

  a_package_with_introspective_queries:
    +static_analysis: off

Disable static analysis for a model using a custom UDF:

models/my_udf_using_model.sql
{{ config(static_analysis='off') }}

select
    user_id,
    my_cool_udf(ip_address) as cleaned_ip
from {{ ref('my_model') }}

When should I turn static analysis off?

Static analysis may incorrectly fail on valid queries if they contain:

  • syntax or native functions that the dbt Fusion Engine doesn't recognize. Please open an issue in addition to disabling static analysis.
  • user-defined functions that the dbt Fusion Engine doesn't recognize. You will need to temporarily disable static analysis. Native support for UDF compilation will arrive in a future version - see dbt-fusion#69.
  • dynamic SQL such as Snowflake's PIVOT ANY which cannot be statically analyzed. You can disable static analysis, refactor your pivot to use explicit column names, or create a dynamic pivot in Jinja.
  • highly volatile data feeding an introspective query during a standalone dbt compile invocation. Because the dbt compile step does not run models, it uses old data or defers to a different environment when running introspective queries. The more frequently the input data changes, the more likely it is for this divergence to cause a compilation error. Consider whether these standalone dbt compile commands are necessary before disabling static analysis.

Examples

No introspective models

 AOT rendering, analysis and execution
  • Fusion renders each model in order.
  • Then it statically analyzes each model's logical plan in order.
  • Finally, it runs each model's rendered SQL. Nothing is persisted to the database until Fusion has validated the entire project.

Introspective model with unsafe static analysis

Imagine we update model_c to contain an introspective query (such as dbt_utils.get_column_values). We'll say it's querying model_b, but the dbt Fusion Engine's response is the same regardless of what the introspection does.

 Unsafe static analysis of introspective models
  • During parsing, Fusion discovers model_c's introspective query. It switches model_c to JIT rendering and opts model_c+ in to JIT static analysis.
  • model_a and model_b are still eligible for AOT compilation, so Fusion handles them the same as in the introspection-free example above. model_d is still eligible for AOT rendering (but not analysis).
  • Once model_b is run, Fusion renders model_c's SQL (using the just-refreshed data), analyzes it, and runs it. All three steps happen back-to-back.
  • model_d's AOT-rendered SQL is analyzed and run.
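The step order in the walkthrough above can be reproduced with a short scheduling sketch. It is a single-threaded simplification with hypothetical names: the sets of JIT-rendered and JIT-analyzed models are passed in directly, and the real engine may interleave work differently, especially with multiple threads.

```python
from typing import List, Set, Tuple

Step = Tuple[str, str]  # (phase, model name)


def schedule(order: List[str], jit_render: Set[str], jit_analyze: Set[str]) -> List[Step]:
    """Linearize render/analyze/run steps for a topologically ordered DAG."""
    steps: List[Step] = []
    # Ahead-of-time phase: render, then analyze, every model that is eligible.
    steps += [("render", m) for m in order if m not in jit_render]
    steps += [("analyze", m) for m in order if m not in jit_analyze]
    # Execution phase: JIT steps happen immediately before the model runs.
    for model in order:
        if model in jit_render:
            steps.append(("render", model))
        if model in jit_analyze:
            steps.append(("analyze", model))
        steps.append(("run", model))
    return steps


order = ["model_a", "model_b", "model_c", "model_d"]
for step in schedule(order, jit_render={"model_c"}, jit_analyze={"model_c", "model_d"}):
    print(step)
```

The printed sequence renders model_a, model_b, and model_d up front, analyzes and runs model_a and model_b, then handles model_c's render, analyze, and run back-to-back before analyzing and running model_d.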
 Complex DAG with an introspective branch

As you'd expect, a branching DAG will AOT compile as much as possible before moving on to the JIT components, and will work with multiple --threads if they're available. Here, model_c can start rendering as soon as model_b has finished running, while the AOT-compiled model_x and model_y run separately.

More information about Fusion

Fusion marks a significant update to dbt. While many of the workflows you've grown accustomed to remain unchanged, there are a lot of new ideas, and a lot of old ones going away. The following is a list of the full scope of our current release of the Fusion engine, including implementation, installation, deprecations, and limitations:
