Column Encryption and Decryption in Databricks

Encrypting data is important, particularly when certain fields contain sensitive information.

Databricks provides the aes_encrypt and aes_decrypt functions to help accomplish this. The documentation for those functions can be found here:

aes_encrypt function — Azure Databricks — Databricks SQL | Microsoft Learn

aes_decrypt function — Azure Databricks — Databricks SQL | Microsoft Learn

The Apache Spark documentation for these Spark SQL functions can be found here:

Spark SQL, Built-in Functions (apache.org)

As the examples in that documentation show, though, these functions are designed primarily for use in Spark SQL. I wanted a solution that let me use PySpark to encrypt and decrypt columns in DataFrames. However, per this documentation…
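As a rough illustration of the idea, here is a minimal sketch of one way to reach the SQL functions from PySpark: build the SQL expression as a string and pass it to pyspark.sql.functions.expr. This is an assumption-laden example, not necessarily the approach this article goes on to describe. It assumes Spark 3.3 or later (where aes_encrypt and aes_decrypt are available). The helper names encrypt_expr, decrypt_expr, and demo, the sample column names, and the hard-coded key are all hypothetical; a real key would come from a secret store rather than source code.

```python
def _key_sql(key: bytes) -> str:
    """Render an AES key as a Spark SQL binary expression."""
    # AES keys must be 16, 24, or 32 bytes long.
    if len(key) not in (16, 24, 32):
        raise ValueError("AES key must be 16, 24, or 32 bytes")
    # Hex digits are safe to embed in a SQL string literal.
    return f"unhex('{key.hex()}')"


def encrypt_expr(column: str, key: bytes) -> str:
    """SQL expression that encrypts a column and base64-encodes the result."""
    return f"base64(aes_encrypt(`{column}`, {_key_sql(key)}))"


def decrypt_expr(column: str, key: bytes) -> str:
    """SQL expression that reverses encrypt_expr and returns a string."""
    return f"cast(aes_decrypt(unbase64(`{column}`), {_key_sql(key)}) as string)"


def demo() -> None:
    """Round-trip one column; needs a Spark 3.3+ runtime (hypothetical data)."""
    # Imported here so the helpers above work without PySpark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import expr

    spark = SparkSession.builder.getOrCreate()
    # Demo-only key; never hard-code a real key.
    key = bytes.fromhex("00112233445566778899aabbccddeeff")
    df = spark.createDataFrame([("alice", "123-45-6789")], ["name", "ssn"])

    encrypted = df.withColumn("ssn", expr(encrypt_expr("ssn", key)))
    decrypted = encrypted.withColumn("ssn", expr(decrypt_expr("ssn", key)))
    encrypted.show(truncate=False)
    decrypted.show(truncate=False)
```

Calling demo() in a Databricks notebook or any Spark 3.3+ session should show the encrypted column followed by the restored one. The base64 step is a design choice in this sketch so the ciphertext can be stored in a string column; it can be dropped if a binary column is acceptable.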
