Adoption of intelligent data applications is growing rapidly across data platforms. With this growth, engineers are looking for efficient and flexible APIs that let them work with their data inside such applications the same way they are used to operating their pipelines.
In this blog post, I’ll demonstrate how the upgraded Databricks Connect protocol (based on Spark Connect) enables developers to write concise and expressive data applications on top of the Databricks Lakehouse Platform with the Dash framework.
The Tale of Protocols
There are several ways to perform operations on data stored in the Databricks Lakehouse from within a data application.
One approach, which I described in a previous blog post, is straightforward and well known: it uses DBSQL together with the SQLAlchemy ORM mechanism.
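For context, a minimal sketch of that approach might look like the following. It assumes the databricks-sql-connector package with SQLAlchemy support is installed; the hostname, HTTP path, token, catalog, and schema values are placeholders, not real endpoints.

```python
from sqlalchemy import create_engine, text

# Placeholder credentials -- substitute your own workspace values.
HOST = "dbc-xxxxxxxx.cloud.databricks.com"
HTTP_PATH = "/sql/1.0/warehouses/xxxxxxxxxxxxxxxx"
TOKEN = "dapi..."

# The databricks-sql-connector package registers a SQLAlchemy
# dialect under the "databricks" URL scheme.
engine = create_engine(
    f"databricks://token:{TOKEN}@{HOST}?http_path={HTTP_PATH}"
    "&catalog=main&schema=default"
)

# Run a trivial probe query through the DBSQL endpoint.
with engine.connect() as conn:
    rows = conn.execute(text("SELECT 1 AS probe")).fetchall()
    print(rows)
```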
However, this time I would like to highlight relatively new Databricks functionality called Databricks Connect "V2". Although this component may sound familiar to many Databricks users, its new version provides a more flexible and faster approach through a thinner client that can potentially be used from applications written in various programming languages.
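As a preview of what this thin client looks like in practice, here is a minimal sketch of opening a Databricks Connect V2 session from Python. It assumes the databricks-connect package (version 13 or later) is installed; the host, token, and cluster ID are placeholder values.

```python
from databricks.connect import DatabricksSession

# Placeholder credentials -- substitute your own workspace values,
# or rely on a configured Databricks CLI profile instead.
spark = (
    DatabricksSession.builder
    .remote(
        host="https://dbc-xxxxxxxx.cloud.databricks.com",
        token="dapi...",
        cluster_id="0000-000000-xxxxxxxx",
    )
    .getOrCreate()
)

# From here on, this behaves like a regular Spark session,
# but all operations execute remotely on the Databricks cluster.
df = spark.range(5)
df.show()
```

The key point is that the returned object is a standard Spark session: existing PySpark code can run against it largely unchanged, while only a thin client runs inside the data application itself.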