Managing Databricks At A Wide Scale

Introduction

I somehow managed to convince our data platform team that I need a higher level of access so I can see everyone's cluster and job configurations. My OCD around FinOps quickly proved warranted once I started perusing our inventory: while we do have various guardrails in place, the sheer number of clusters and jobs means there's plenty of room for improvement.

Pointing out these holes isn't something one person, or even a group of people, can do alone. Instead, we need to harness the power of automation to identify and address our gaps. How can we do that?

Methodology

So, how exactly do we want to go about this? My idea is a weekly job that runs and identifies "misses" across our Databricks workspaces. The job compiles each offending cluster or job, along with the user or team operating it, into a list. That list is then emailed or sent to a Slack channel with the data platform team as its recipients. The platform team can then work with the owning teams to address these pain points.

Read More: https://medium.com/@matt_weingarten/managing-databricks-at-a-wide-scale-7d04d8d6c284
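As a rough illustration of what such a weekly job could look like, here is a minimal sketch using the Databricks SDK for Python and a Slack incoming webhook. The specific checks (missing auto-termination, fixed-size clusters, untagged jobs), the function names, and the SLACK_WEBHOOK_URL environment variable are all assumptions for the example, not details from the article.

```python
# Hypothetical weekly "misses" audit, assuming the Databricks SDK for Python
# (pip install databricks-sdk) and a Slack incoming webhook. The checks below
# are illustrative examples of FinOps guardrails, not a prescribed list.
import os

import requests
from databricks.sdk import WorkspaceClient


def find_cluster_misses(w: WorkspaceClient) -> list[str]:
    """Flag clusters that miss basic cost guardrails (example checks only)."""
    misses = []
    for cluster in w.clusters.list():
        owner = cluster.creator_user_name
        if not cluster.autotermination_minutes:
            misses.append(
                f"Cluster '{cluster.cluster_name}' (owner: {owner}) has no auto-termination"
            )
        if cluster.autoscale is None:
            misses.append(
                f"Cluster '{cluster.cluster_name}' (owner: {owner}) uses a fixed worker count"
            )
    return misses


def find_job_misses(w: WorkspaceClient) -> list[str]:
    """Flag jobs without cost-attribution tags (example check only)."""
    misses = []
    for job in w.jobs.list():
        settings = job.settings
        if settings and not settings.tags:
            misses.append(
                f"Job '{settings.name}' (owner: {job.creator_user_name}) has no tags"
            )
    return misses


def post_to_slack(lines: list[str]) -> None:
    """Send the compiled list to a Slack incoming webhook (URL is a placeholder)."""
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]  # assumed to be set by the scheduler
    text = "Weekly Databricks FinOps misses:\n" + "\n".join(f"- {m}" for m in lines)
    requests.post(webhook_url, json={"text": text}, timeout=10)


if __name__ == "__main__":
    # WorkspaceClient() reads host/token from the environment or a config profile.
    w = WorkspaceClient()
    misses = find_cluster_misses(w) + find_job_misses(w)
    if misses:
        post_to_slack(misses)
```

A script like this could itself be scheduled as a Databricks job with a weekly cron trigger, or the Slack call could be swapped for an email notification, depending on how the platform team prefers to receive the report.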
Tags: Wide scale