Tag: Delta

Dev versus Delta: Demystifying engineering roles at Palantir

As a Hiring Manager for engineers, I get a lot of questions about our unique engineering model. Specifically, people want to know: what’s the difference between a Software Engineer and a Forward Deployed Software Engineer? I polled Palantirians from around the globe for their an...

Efficient Change Data Capture (CDC) on Databricks Delta Tables with Spark

In today’s data-driven applications, organizations face a critical challenge: ensuring near-real-time data aggregation and accuracy for display on dashboards. As businesses integrate larger and more complex datasets from various sources, including data streamed from Kafka, they encoun...

Optimize your Delta Tables & ETLs with Change Data Feed (CDF) in Databricks

After explaining what Delta Live Tables are and then going in depth on how we can record data source changes of those tables with Change Data Capture (CDC), there is yet another useful feature for your Delta Tables called Change Data Feed or CDF. This feature will record cha...

Delta Live Tables: Simplify the ETL Process

Databricks Delta Live Tables provide one of the key solutions for building and managing reliable, robust data engineering pipelines that can load streaming and batch data and deliver high-quality data on the Lakehouse Platform. DLT not only helps an engineer to simplify the ETL development ...

Understanding Delta Tables Constraints

Delta Lake, an open-source storage layer that brings reliability to data lakes, allows you to store and manage data in data lakes. Delta tables are a core concept of Delta Lake, which enables data versioning, transactional reads and writes, schema enforcement, and metadata management. In this articl...

Building an End-to-End Data Pipeline with Delta Lake and Databricks

Introduction: In this article, we will walk through the process of building a data pipeline using Delta Lake and Databricks. We will use COVID-19 data for the USA, available on Kaggle, as our dataset. This pipeline will demonstrate how to ingest raw data, clean and transform it, and finally v...

5 reasons to choose Delta format (on Databricks)

In this blog post, I will explain 5 reasons to prefer the Delta format to Parquet or ORC when you are using Databricks for your analytic workloads. But before we start, let’s have a look at what the Delta format is. Delta … an introduction: Delta is a data format based on Apache Par...

How Delta Sharing Works

Delta Sharing is an open standard for secure data sharing. Having worked as a data solution architect with UK government departments over the years, one of the challenges that frequently cropped up was how to securely share data between departments. Delta Sharing would have been a straightforward an...

Solving Delta Table Concurrency Issues: Practical Code Solutions & Insights

Delta Lake is a powerful technology for bringing ACID transactions to your data lakes. It allows multiple operations to be performed on a dataset concurrently. However, dealing with concurrent operations can sometimes be tricky and may lead to issues such as `ConcurrentAppendException`, `ConcurrentD...

Unlocking Performance: Optimize, Vacuum, and Z-Ordering in Databricks’ Delta Tables

Delta Lake is a powerful storage layer that brings ACID transactions to Apache Spark and big data workloads. Delta Lake not only enhances reliability but also introduces various optimization techniques that can significantly boost performance and streamline data workflows. In this article, we wil...

Databricks Delta Lake Tables: Managed vs Unmanaged

Delta Lake is a powerful storage layer for big data processing workloads in Databricks. In the previous article, we discussed “Delta Lake on Databricks: Python Installation and Setup Guide”. When working with Delta Lake tables, you can choose between two types of tables: managed and unmanaged. In...

Mist on the Water: The crash of Delta flight 723

On the 31st of July 1973, a Delta Air Lines DC-9 on approach to Boston, Massachusetts slammed into a seawall at the foot of the runway, spewing burning wreckage across the airport and killing 88 of the 89 people on board. The lone survivor was Leopold Chouinard, who clung to life despite severe inju...

Argentine Asado in the Tigre Delta

We piled into the dinghy and braced ourselves. The marina behind us quickly disappeared as Frederic ripped through the water weaving between other boats, bouncing over their wake. The canals narrowed. Stilted homes appeared, each with its own dock. We slowed into the creamy tea-colored waters tinted...

A Trip Down The Tigre Delta, Buenos Aires

Tigre is a natural tourist attraction at the mouth of the Parana River Delta, combining typical, picturesque river houses with wild landscapes. The area is perfect for those who love nature (not me really) and bird watching (still not me) or spending the day on a river boat with...

Flying High with AI: Counting Pelican Breeding Pairs in the Danube Delta

Imagine flying in a small airplane over the vast wetlands of the Danube Delta on the shores of the Black Sea in Romania looking for patches of small white dots: great white pelicans (Pelecanus onocrotalus). While flying over the colonies researchers like Sebastian Bugariu from the Romanian Ornitholo...

How are Delta-8 THC and Delta-9 THC Produced?

Delta-8-tetrahydrocannabinol (delta-8 THC) is a cannabinoid naturally found in very low concentrations in cannabis plants. However, it can also be produced through chemical conversion or extraction processes, typically from cannabidiol (CBD), which is more abundant in hemp plants. There are vario...