CI/CD on Databricks using Azure DevOps

Introduction

This blog post explains how to configure and build an end-to-end CI/CD pipeline for Databricks using Azure DevOps, along with best practices for deploying libraries to a workspace. For security, the pipeline authenticates with an Azure service principal rather than personal credentials.

A typical Azure Databricks pipeline includes the following steps.

Continuous integration

  1. Develop code in a Databricks notebook or an external IDE.
  2. Build libraries.
  3. Release — generate a release artifact.
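
The continuous-integration steps above can be sketched as an Azure Pipelines YAML definition. This is a minimal sketch assuming the library is a Python wheel; the branch name, Python version, and artifact name are placeholders:

```yaml
# azure-pipelines.yml — hypothetical build stage for the library
trigger:
  - main

pool:
  vmImage: ubuntu-latest

steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: '3.10'

  # Build the wheel from the committed packaging files
  - script: |
      pip install build
      python -m build --wheel --outdir $(Build.ArtifactStagingDirectory)
    displayName: Build library wheel

  # Publish the wheel as a release artifact for the CD stage
  - task: PublishBuildArtifacts@1
    inputs:
      pathToPublish: $(Build.ArtifactStagingDirectory)
      artifactName: libraries
```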

Continuous deployment

  1. Deploy libraries or notebooks.
  2. Run automated tests.
  3. Programmatically schedule data engineering and analytics workflows.
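
For the deployment steps, the pipeline needs a token to call the Databricks REST API. A minimal sketch of acquiring one for a service principal via the Azure AD client-credentials flow is shown below; the tenant ID, client ID, and secret are placeholders that would come from pipeline variables:

```python
# Sketch: acquiring an Azure AD token for a service principal so the
# CD stage can call the Databricks REST API without personal tokens.
import json
import urllib.parse
import urllib.request

# Well-known Azure AD resource ID for the Azure Databricks service.
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

def build_aad_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return the (url, form_data) for the client-credentials token call."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    data = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": DATABRICKS_RESOURCE_ID,
    }
    return url, data

def get_token(tenant_id: str, client_id: str, client_secret: str) -> str:
    """Perform the token request (network call; runs inside the pipeline)."""
    url, data = build_aad_token_request(tenant_id, client_id, client_secret)
    body = urllib.parse.urlencode(data).encode()
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        return json.loads(resp.read())["access_token"]
```

The returned token is then passed as a `Bearer` header on subsequent Databricks API calls.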

Suppose you have developed your code in an IDE or in notebooks and committed it to an Azure Repos Git repository, and you now want to build a library (a Python .whl or a JAR file) from it using DevOps principles.

Consider the following screenshot as the committed code from which you want to build a library, with pipeline.py as the main Python notebook that you want to schedule to run analytics workflows.
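
Scheduling pipeline.py programmatically can be sketched with the Databricks Jobs 2.1 API. The workspace URL, notebook path, cluster specification, and cron expression below are illustrative placeholders:

```python
# Sketch: scheduling pipeline.py as a Databricks job via the Jobs 2.1 API.
import json
import urllib.request

def build_job_payload(notebook_path: str) -> dict:
    """Jobs 2.1 create payload: run the notebook nightly at 02:00 UTC."""
    return {
        "name": "analytics-pipeline",  # hypothetical job name
        "tasks": [
            {
                "task_key": "run_pipeline",
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",
                    "node_type_id": "Standard_DS3_v2",
                    "num_workers": 2,
                },
            }
        ],
        "schedule": {
            "quartz_cron_expression": "0 0 2 * * ?",
            "timezone_id": "UTC",
        },
    }

def create_job(workspace_url: str, token: str, notebook_path: str) -> int:
    """POST the payload to /api/2.1/jobs/create (network call)."""
    req = urllib.request.Request(
        f"{workspace_url}/api/2.1/jobs/create",
        data=json.dumps(build_job_payload(notebook_path)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["job_id"]
```

In the pipeline, `create_job` would be called with the service-principal token obtained earlier, giving a fully automated deploy-and-schedule step.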

Tags: Azure DevOps