Upserting Data from Databricks to Azure SQL DB

This article explains two different methods for upserting data from the Databricks lakehouse platform to Azure SQL DB. Code snippets can be found on my GitHub: ssaenzf/databricksLearning (github.com), a repo with code from articles related to the Databricks lakehouse platform. Hope you enjoy :)

The medallion architecture is used to logically organize data in a lakehouse, with the goal of incrementally and progressively improving the structure and quality of data as it flows through each layer of the architecture. Sometimes there is a need to move data further, from the silver or gold layers into a database. This use case usually arises when legacy systems already consume the SQL DB and there is reluctance to consume directly from the lakehouse, because the consumer processes would need to change their driver connection. Nevertheless, when possible, data should be consumed directly from Databricks. That way no further processing is needed to move data from Databricks to the SQL DB, and space is saved in the SQL DB, with the corresponding cost savings: SQL DB storage is charged at a higher rate per size than Databricks storage, which is file based. Databricks offers SQL Warehouse clusters with Photon acceleration for consuming data efficiently through JDBC or ODBC drivers.

Read More: https://levelup.gitconnected.com/upserting-data-from-databricks-to-azure-sql-db-46627a407930
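To make the upsert pattern concrete, below is a minimal sketch of one common approach (not necessarily either of the two methods detailed in the full article): bulk-write the curated DataFrame into a staging table in Azure SQL DB over JDBC, then run a MERGE from the staging table into the target table. All names here are hypothetical (server, database, secret scope, table and column names), and the cluster is assumed to have the pyodbc package and the Microsoft ODBC driver available for the final step.

```python
# Sketch only: hypothetical names throughout. Assumes a Databricks notebook,
# where `dbutils` is available, plus pyodbc and the Microsoft ODBC driver on the cluster.
import pyodbc
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

jdbc_url = "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>"
sql_user = dbutils.secrets.get("kv-scope", "sql-user")          # hypothetical secret scope/keys
sql_password = dbutils.secrets.get("kv-scope", "sql-password")

# 1) Read the curated data from the lakehouse (silver/gold layer).
df = spark.read.table("catalog.silver.customers")               # hypothetical table

# 2) Bulk-write it into a staging table in Azure SQL DB over JDBC.
(df.write.format("jdbc")
   .option("url", jdbc_url)
   .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
   .option("dbtable", "dbo.customers_staging")
   .option("user", sql_user)
   .option("password", sql_password)
   .mode("overwrite")
   .save())

# 3) Upsert from staging into the target table with a MERGE statement.
merge_sql = """
MERGE dbo.customers AS tgt
USING dbo.customers_staging AS src
    ON tgt.customer_id = src.customer_id
WHEN MATCHED THEN
    UPDATE SET tgt.name = src.name, tgt.email = src.email
WHEN NOT MATCHED THEN
    INSERT (customer_id, name, email)
    VALUES (src.customer_id, src.name, src.email);
"""
odbc_conn_str = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<server>.database.windows.net;DATABASE=<db>;"
    f"UID={sql_user};PWD={sql_password}"
)
conn = pyodbc.connect(odbc_conn_str)
try:
    conn.cursor().execute(merge_sql)
    conn.commit()
finally:
    conn.close()
```

The staging-table step matters because Spark's JDBC writer can only append or overwrite rows; the conditional update-or-insert logic has to run inside Azure SQL DB itself, which is what the MERGE provides.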
Tags: Azure SQL