5 reasons to choose Delta format (on Databricks)

<p>In this blog post, I will explain 5 reasons to prefer the Delta format over Parquet or ORC when you are using Databricks for your analytics workloads.</p>
<p>But before we start, let&rsquo;s have a look at what the Delta format is.</p>
<h1>Delta &hellip; an introduction</h1>
<p>Delta is a data format based on Apache Parquet. It&rsquo;s an open source project (<a href="https://github.com/delta-io/delta" rel="noopener ugc nofollow" target="_blank">https://github.com/delta-io/delta</a>), delivered with the Databricks runtimes, and it&rsquo;s the default table format from runtime 8.0 onwards.</p>
<p>You can use the Delta format from notebooks and applications executed in Databricks with various APIs (<a href="http://spark.apache.org/docs/latest/api/python/index.html" rel="noopener ugc nofollow" target="_blank">Python</a>,&nbsp;<a href="http://spark.apache.org/docs/latest/api/scala/index.html" rel="noopener ugc nofollow" target="_blank">Scala</a>,&nbsp;<a href="https://spark.apache.org/docs/latest/api/sql/index.html" rel="noopener ugc nofollow" target="_blank">SQL&nbsp;</a>etc.) and also with Databricks SQL.</p>
<p>Delta is made up of several components working together.</p>
<p><a href="https://medium.com/datalex/5-reasons-to-use-delta-lake-format-on-databricks-d9e76cf3e77d"><strong>Read More</strong></a></p>
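<p>As a minimal sketch of what this looks like in practice (the table name <code>events</code> and its columns are made up for illustration), creating a Delta table in a Databricks SQL environment is just a regular <code>CREATE TABLE</code>:</p>
<pre><code>-- On Databricks Runtime 8.0 and above, Delta is the default table format,
-- so USING DELTA is optional here and shown only for clarity.
CREATE TABLE events (
  event_id   BIGINT,
  event_time TIMESTAMP,
  payload    STRING
) USING DELTA;

-- The table is queried like any other; under the hood the data is stored
-- as Parquet files alongside a transaction log.
SELECT COUNT(*) FROM events;
</code></pre>
<p>The same table is equally accessible from the Python and Scala APIs, which is part of what makes Delta convenient as a shared format across workloads.</p>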
Tags: Delta format