<h1>5 reasons to choose Delta format (on Databricks)</h1>
<p>In this blog post, I will explain 5 reasons to prefer the Delta format over Parquet or ORC when you are using Databricks for your analytics workloads.</p>
<p>But before we start, let’s have a look at what the Delta format is.</p>
<h1>Delta … an introduction</h1>
<p>Delta is a data format based on Apache Parquet. It’s an open source project (<a href="https://github.com/delta-io/delta" rel="noopener ugc nofollow" target="_blank">https://github.com/delta-io/delta</a>), delivered with the Databricks runtimes, and it’s the default table format from Databricks Runtime 8.0 onwards.</p>
<p>You can use the Delta format through notebooks and applications executed in Databricks with various APIs (<a href="http://spark.apache.org/docs/latest/api/python/index.html" rel="noopener ugc nofollow" target="_blank">Python</a>, <a href="http://spark.apache.org/docs/latest/api/scala/index.html" rel="noopener ugc nofollow" target="_blank">Scala</a>, <a href="https://spark.apache.org/docs/latest/api/sql/index.html" rel="noopener ugc nofollow" target="_blank">SQL</a>, etc.) and also with Databricks SQL.</p>
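<p>As a quick illustration of the Python API, here is a minimal sketch of writing and reading a Delta table with PySpark. It assumes a Spark environment with the <code>delta-spark</code> package available (as on a Databricks cluster, where the session is preconfigured); the table path and data are hypothetical.</p>
<pre><code>
from pyspark.sql import SparkSession
from delta import configure_spark_with_delta_pip
import os, tempfile

# On Databricks, `spark` already exists; this builder setup is only
# needed when running delta-spark locally.
builder = (
    SparkSession.builder.appName("delta-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

# Write a small DataFrame as a Delta table (hypothetical path).
path = os.path.join(tempfile.mkdtemp(), "events")
spark.range(5).write.format("delta").save(path)

# Read it back with the same format identifier.
df = spark.read.format("delta").load(path)
count = df.count()
</code></pre>
<p>The only Delta-specific part is <code>format("delta")</code>; the rest is standard Spark DataFrame code, which is why existing Parquet pipelines migrate easily.</p>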
<p>Under the hood, Delta is made of several components:</p>
<p><a href="https://medium.com/datalex/5-reasons-to-use-delta-lake-format-on-databricks-d9e76cf3e77d"><strong>Read More</strong></a></p>