
dbt Labs recently announced dbt Fusion, a complete overhaul of the dbt Core engine built in Rust. It promises to significantly improve the developer experience. In this article, we test its core features and share our hands-on experience with the public beta, exploring what works, what doesn't, and what potential it holds for the future of dbt development.
On May 28th, dbt Labs announced dbt Fusion, a complete overhaul of the engine that powers dbt.
In this article, I'll cover the core features and goals of dbt Fusion. Then, I'll share my first impressions from putting the public beta through its paces.
At its core, dbt Fusion is a ground-up rewrite of the dbt Core engine. It is intended to provide the same functionality in a more optimized form, plus some new features. The most significant architectural change is the switch from Python to Rust. Rust is renowned for its performance and memory safety, and dbt Labs is leveraging it to deliver a faster, more robust experience.
The move to Rust promises significant performance improvements. However, it's crucial to understand where you'll feel this speed boost. dbt Fusion accelerates project parsing and SQL compilation—the steps dbt performs locally before sending code to your data warehouse.
The actual execution of your models still happens on your data warehouse, and whether you run Fusion or Core has no impact on that runtime. Since parsing and compilation are often a small fraction of a total production run's duration, the effect on your pipeline's end-to-end runtime will likely be negligible.
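To make the point concrete, here is a small back-of-the-envelope sketch in Python. All numbers are hypothetical and only illustrate the arithmetic: when only the local parse/compile step gets faster, the end-to-end gain depends on how large that step was to begin with.

```python
def end_to_end_speedup(local_seconds: float, warehouse_seconds: float, local_speedup: float) -> float:
    """Overall speedup when only the local (parse/compile) part gets faster."""
    before = local_seconds + warehouse_seconds
    after = local_seconds / local_speedup + warehouse_seconds
    return before / after


# Hypothetical production run: 30 s of parsing/compiling, 30 min on the warehouse.
production = end_to_end_speedup(local_seconds=30, warehouse_seconds=1800, local_speedup=10)

# Hypothetical dev loop: 20 s of parsing/compiling, 5 s to build one small model.
dev_loop = end_to_end_speedup(local_seconds=20, warehouse_seconds=5, local_speedup=10)

print(f"production run: {production:.2f}x faster")
print(f"dev loop:       {dev_loop:.2f}x faster")
```

With these made-up inputs, the production run barely changes while the dev loop gets several times faster, which matches the framing that the benefit lands in the development workflow rather than in scheduled runs.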
The primary goal of dbt Fusion, as stated by dbt Labs, is not to shorten pipeline runtimes but to enhance the developer experience. This is the lens through which we should evaluate it.
dbt Labs didn't just release a new engine; they also introduced a collection of new components designed to work together. dbt Fusion uses new database adapters built on the Apache Arrow ADBC (Arrow Database Connectivity) standard, aiming for more efficient data transfer. The move to Rust also required a new Jinja engine. Finally, they released a VS Code extension that leverages Fusion's capabilities to provide a powerful, IDE-native development environment.
Perhaps the most fundamental change is that dbt is no longer just a sophisticated text processor. The Fusion engine can now parse and understand the rendered SQL it generates.
This is a significant change. Previously, dbt treated your model code as a string to be manipulated with Jinja until it was ready to be sent to the warehouse. Now, by parsing the SQL, dbt Fusion enables powerful new capabilities. For instance, it allows for local syntax validation, letting you catch SQL errors directly in your IDE before ever running a command. Furthermore, this deep understanding of your code's structure is what enables a rich IDE integration, powering the features in the new VS Code extension.
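The difference between rendering text and understanding SQL can be illustrated with a toy sketch. This is not how Fusion is implemented: `string.Template` stands in for Jinja, and SQLite's parser stands in for a SQL-aware engine. The `orders` table and both function names are made up for the example.

```python
import sqlite3
from string import Template
from typing import Optional


def render(template: str, **context: str) -> str:
    """Pure text substitution: the renderer has no idea whether the result is valid SQL."""
    return Template(template).substitute(**context)


def sql_error(sql: str) -> Optional[str]:
    """Return the parser's error message, or None if the statement is accepted.

    SQLite is only a stand-in here; a real engine would use the warehouse's dialect.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema so that valid queries have something to refer to.
        conn.execute("create table orders (id integer, amount real)")
        # EXPLAIN makes SQLite parse and plan the statement without running it.
        conn.execute(f"explain {sql}")
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


good = render("select id, amount from $table", table="orders")
bad = render("selct id, amount from $table", table="orders")  # typo survives rendering

print(sql_error(good))  # None: the statement parses
print(sql_error(bad))   # an error message pointing at the typo
```

The renderer happily produces both strings; only the step that actually parses the SQL notices that the second one is broken, and it does so locally, without a round trip to a warehouse.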
The VS Code extension is not a part of the engine itself, but it is arguably the biggest contributor to the "improved developer experience" that dbt Labs is aiming for. It promises a dbt language server with features like:
Theory is one thing, but how does it work in practice? I decided to put the beta to the test.
Disclaimer: dbt Fusion is in public beta. The features are incomplete, and bugs are expected. I tested on a Windows machine, so your experience may vary on macOS or Linux.
First, you need to install dbt Fusion. It's distributed as a standalone binary executable, which is very convenient. You don't need to have Python installed and you can avoid the hassle of creating and managing virtual environments. Just download the binary and add it to your PATH.
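If you already have dbt Core installed somewhere, it can be worth checking which executable your shell actually resolves after editing your PATH. The following Python sketch does that with the standard library; it assumes the command is invoked as `dbt`, so adjust the name if your installation uses a different one.

```python
import shutil
from typing import Optional


def resolve_executable(name: str) -> Optional[str]:
    """Return the full path of the first matching executable on PATH, or None."""
    return shutil.which(name)


# Assumption: the binary is called "dbt"; change this if yours is named differently.
path = resolve_executable("dbt")
print(path or "no dbt executable found on PATH")
```

If the printed path points into a Python virtual environment rather than the location of the downloaded binary, an existing dbt Core installation is probably taking precedence.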
You can find the official installation guide here: Install dbt Fusion
To try Fusion, your project must use a supported data platform. At launch, only Snowflake was available, with Databricks and BigQuery adapters planned for release in June. Additionally, Python models are not yet supported, so if your project relies on them, you'll have to wait.
Your project might also use deprecated YAML configurations. While dbt Core tolerates these with warnings, dbt Fusion will eventually fail on them. You can use the dbt-autofix tool to update your project's syntax. It can be run directly via uvx, and it worked perfectly for me.
After running a few CLI commands, I found the speed improvement noticeable. However, I ran into some rough edges:
The VS Code extension is where the promise of an enhanced developer experience really lies. Unfortunately, in its current beta state, it falls short of expectations.
It did not work for me out of the box, and I had to do some troubleshooting. The setup requires you to register the extension with dbt Cloud, and dbt Labs advises turning off any other dbt-related extensions you may be using. Even after completing all these steps, the extension did not deliver the experience I was hoping for.
Most of the advanced editing features were completely non-functional for me. Autocomplete, "Go to Definition," live error detection, and informational hovers on tables/columns did not work at all. Despite significant time spent troubleshooting, I couldn't get them running, and the LSP server provides very little feedback for debugging.
On a positive note, the model and column-level lineage visualization works well. It's a great feature, though the user experience could be improved. On large projects, the graph quickly becomes overcrowded. Options to filter the graph in the UI or to focus it on a selection would be welcome additions.
dbt Fusion represents a new direction for the future of dbt, especially if we consider the license changes as well (a topic that deserves its own article). Although the ability to parse SQL and the new architecture unlock immense potential, the beta shows there is still a long way to go. The CLI has some quirks, and the VS Code extension, the key to the promised developer experience, is currently very rough.
For Fusion to see widespread adoption, these issues must be ironed out during the beta period to ensure a seamless transition from dbt Core at general availability. The team is actively working on fixes, as seen in the project's GitHub repository. It will be worth checking in on their progress in a few weeks.