Task Parameters and Values in Databricks Workflows

Databricks provides powerful, dynamic orchestration capabilities for building scalable pipelines that support data engineering, data science, and data warehousing workloads. [Databricks Workflows](https://www.databricks.com/product/workflows) let users create jobs composed of many tasks. Tasks run in isolation, can execute in parallel, and can be set to follow specific dependencies. Focusing on creating parameterized jobs through both the user interface and the APIs, we will cover:

- Ephemeral job clusters
- Reusing job clusters
- Linking jobs to external git repositories
- Passing values between tasks within a job (sketched below)
- Parameterizing tasks

Please note that we will focus on notebook tasks in Databricks; however, much of what we discuss applies similarly to other task types.
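As a preview of the last two items, the sketch below shows how a notebook task might read a task parameter and share a value with a downstream task using `dbutils.widgets` and `dbutils.jobs.taskValues`. The task, widget, and key names (`ingest`, `source_path`, `row_count`) are placeholders for illustration, assuming a simple two-task job.

```python
# --- Hypothetical upstream task (notebook task named "ingest") ---

# Define a widget so the notebook runs interactively; in a job, the task
# parameter "source_path" overrides this default.
dbutils.widgets.text("source_path", "/tmp/input")
source_path = dbutils.widgets.get("source_path")

# Publish a value that downstream tasks in the same job run can read.
dbutils.jobs.taskValues.set(key="row_count", value=1234)

# --- Hypothetical downstream task (notebook task named "report") ---

# Retrieve the value published by the "ingest" task.
row_count = dbutils.jobs.taskValues.get(
    taskKey="ingest",   # name of the upstream task in the job
    key="row_count",
    default=0,          # returned if the key was never set
    debugValue=0,       # used when running the notebook outside of a job
)
print(f"Rows ingested from {source_path}: {row_count}")
```

We will walk through where these parameters and values are defined in the job configuration, both in the UI and via the APIs, in the sections that follow.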