What is Encryption
Encryption is a data security measure that encodes information by converting plaintext into ciphertext using a key.
What is an Encryption Key
An encryption key is a string of bits used to scramble and unscramble data. It is a central component of data security and cryptography.
What is Decryption
Decryption is the process of converting encrypted data back to its original, unencrypted form.
Install the cryptography package in Python
pip install cryptography
Import some modules related to Cryptography
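The import cell is not shown here, so the following is a minimal sketch of what it could contain. It assumes the key is a Fernet (symmetric) key from the cryptography package and that the key file is created with the standard library; neither choice is confirmed by the text.

```python
# Assumption: Fernet is the cipher in use; the original import cell is not shown.
import os        # building file paths and listing directories
import tempfile  # creating a temporary file to hold the key

from cryptography.fernet import Fernet  # symmetric encryption and decryption
```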
Generate Key for File Encryption
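One possible way to generate the key and keep it in a temporary file is sketched below. It assumes a Fernet key; the names `key`, `key_file` and `key_path` are illustrative, not taken from the original notebook.

```python
import tempfile

from cryptography.fernet import Fernet

# Generate a new random key (URL-safe base64-encoded bytes).
key = Fernet.generate_key()

# Write the key to a temporary file that is kept after it is closed.
with tempfile.NamedTemporaryFile(delete=False) as key_file:
    key_file.write(key)
    key_path = key_file.name

print(key_path)  # path of the temporary file that holds the key
```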
The code above stores the key in a temporary file whose path looks like this: /tmp/tmpp7yct_de
Note: If you do not have WRITE access to Databricks DBFS (that is, you only have READ access), create a temporary file as shown above to store the key.
Code to access DBFS file path
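On Databricks, DBFS is typically exposed to local file APIs under the /dbfs mount, so a `dbfs:/...` location can usually be read with plain Python by rewriting its prefix. The helper below is a hypothetical sketch of that rewrite; the function name and the sample path are assumptions.

```python
def to_local_path(dbfs_path: str) -> str:
    """Map a 'dbfs:/...' URI to the '/dbfs/...' path used by local file APIs."""
    prefix = "dbfs:/"
    if dbfs_path.startswith(prefix):
        return "/dbfs/" + dbfs_path[len(prefix):].lstrip("/")
    return dbfs_path  # already a local-style path, leave unchanged


# Hypothetical file location, for illustration only.
print(to_local_path("dbfs:/FileStore/tables/sample.csv"))
```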
To specify Correct File Path (Add this if needed)
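A small sketch of building the full path from a folder and a file name, then checking it before use. The folder and file name are placeholders, not values from the original notebook.

```python
import os

file_dir = "/dbfs/FileStore/tables"  # hypothetical folder
file_name = "sample.csv"             # hypothetical file name
file_path = os.path.join(file_dir, file_name)

# Confirm the path exists before trying to read the file.
if not os.path.exists(file_path):
    print(f"File not found: {file_path}")
```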
Encrypt file using the key generated
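The encryption step could look like the sketch below, assuming a Fernet key. So that it runs anywhere, it creates a small stand-in file in a temporary folder; in a notebook you would point `file_path` at your own file and reuse the key generated earlier.

```python
import os
import tempfile

from cryptography.fernet import Fernet

# Stand-in input file with made-up content, so the sketch is self-contained.
work_dir = tempfile.mkdtemp()
file_path = os.path.join(work_dir, "sample.csv")
with open(file_path, "wb") as f:
    f.write(b"id,name\n1,Ada\n")

key = Fernet.generate_key()  # in practice, reuse the key generated earlier
fernet = Fernet(key)

# Read the plaintext, encrypt it, and write the ciphertext to a new file.
with open(file_path, "rb") as f:
    original_data = f.read()
encrypted_data = fernet.encrypt(original_data)

encrypted_path = file_path + ".encrypted"
with open(encrypted_path, "wb") as f:
    f.write(encrypted_data)
```

Writing the ciphertext to a separate file keeps the original intact until you have confirmed that decryption works.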
List File Directory (Add code if needed)
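To confirm that the encrypted file was written, you can list the folder. This sketch uses only the standard library and a stand-in folder; in a Databricks notebook, `dbutils.fs.ls` is an alternative for DBFS paths.

```python
import os
import tempfile

# Stand-in folder; on Databricks this might be a /dbfs/... folder instead.
directory = tempfile.gettempdir()

entries = sorted(os.listdir(directory))
for name in entries:
    print(name)
```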
Decrypt the encrypted file
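Decryption reverses the previous step with the same key. The sketch below assumes Fernet and first creates an encrypted stand-in file so that it can run on its own; the file names are illustrative.

```python
import os
import tempfile

from cryptography.fernet import Fernet, InvalidToken

# Setup: create an encrypted stand-in file (made-up content).
key = Fernet.generate_key()  # in practice, load the key saved earlier
fernet = Fernet(key)
work_dir = tempfile.mkdtemp()
encrypted_path = os.path.join(work_dir, "sample.csv.encrypted")
with open(encrypted_path, "wb") as f:
    f.write(fernet.encrypt(b"id,name\n1,Ada\n"))

# Read the ciphertext and decrypt it with the same key.
with open(encrypted_path, "rb") as f:
    encrypted_data = f.read()
try:
    decrypted_data = fernet.decrypt(encrypted_data)
except InvalidToken:
    # Raised when the key is wrong or the file has been altered.
    raise SystemExit("Decryption failed: wrong key or corrupted file")

# Write the recovered plaintext to a new file.
decrypted_path = os.path.join(work_dir, "sample_decrypted.csv")
with open(decrypted_path, "wb") as f:
    f.write(decrypted_data)
```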
Parameterization: parameterizing the variables above (File_Name, File_Path and Encryption Key)
What is Parameterization in Python
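Parameterization means passing values such as the file name, file path and encryption key into the code as parameters instead of hard-coding them, so the same code can be reused without duplication.
Install the cryptography package in Python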
pip install cryptography
Import some modules related to Cryptography
Generate File Key
Code to access DBFS path
Specify the correct file path and name
Define File_Path, File_Name and Encryption Key
Encrypt File
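A parameterized version of the encryption step might look like the sketch below. The function name `encrypt_file`, the variable `Encryption_Key` and the sample values are assumptions, and Fernet is assumed as the cipher.

```python
import os
import tempfile

from cryptography.fernet import Fernet


def encrypt_file(file_path: str, file_name: str, encryption_key: bytes) -> bytes:
    """Encrypt file_path/file_name and write '<file_name>.encrypted' beside it."""
    source = os.path.join(file_path, file_name)
    with open(source, "rb") as f:
        data = f.read()
    encrypted = Fernet(encryption_key).encrypt(data)
    with open(source + ".encrypted", "wb") as f:
        f.write(encrypted)
    return encrypted


# Hypothetical parameter values; replace them with your own.
File_Path = tempfile.mkdtemp()
File_Name = "sample.csv"
Encryption_Key = Fernet.generate_key()

# Stand-in input file so the sketch runs on its own.
with open(os.path.join(File_Path, File_Name), "wb") as f:
    f.write(b"id,name\n1,Ada\n")

encrypted_data = encrypt_file(File_Path, File_Name, Encryption_Key)
```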
Decrypt File
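The matching decryption step can take the same three parameters. As before, `decrypt_file`, `Encryption_Key` and the sample values are illustrative assumptions, and the setup lines only exist to make the sketch runnable by itself.

```python
import os
import tempfile

from cryptography.fernet import Fernet


def decrypt_file(file_path: str, file_name: str, encryption_key: bytes) -> bytes:
    """Read file_path/file_name and return its decrypted contents."""
    source = os.path.join(file_path, file_name)
    with open(source, "rb") as f:
        encrypted = f.read()
    return Fernet(encryption_key).decrypt(encrypted)


# Hypothetical parameter values; replace them with your own.
File_Path = tempfile.mkdtemp()
File_Name = "sample.csv.encrypted"
Encryption_Key = Fernet.generate_key()

# Setup: write an encrypted stand-in file to decrypt.
with open(os.path.join(File_Path, File_Name), "wb") as f:
    f.write(Fernet(Encryption_Key).encrypt(b"id,name\n1,Ada\n"))

decrypted_data = decrypt_file(File_Path, File_Name, Encryption_Key)
```

Because only the arguments change, the same function can be reused for any file and key.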
To see Encrypted File
print(encrypted_data)
To see Decrypted File
print(decrypted_data)
In summary, encrypting and decrypting a file is a process, and you need a key to both encrypt and decrypt the file. Running the code above in a Databricks notebook will help you get this task done. You can also read up on the difference between Python and PySpark code.