tDataEncrypt - 7.3

Data privacy

Version: 7.3
Language: English (United States)
Product: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio

Protects data by transforming it into unreadable cipher text.

Only users with the user-defined password and the cryptographic file can decrypt this cipher text and read the original data.

Note: The minimum required Java version for this component is Java 8u161.

In local mode, Apache Spark 1.6 and later versions are supported.

For more technologies supported by Talend, see Talend components.

Why encrypt data?

Encryption protects your assets, your organization, and your customers' sensitive data against internal or external leakage.

In Big Data environments, large volumes of data from many sources are collected, manipulated and stored in various formats. Encryption helps reduce the risk of sensitive data exposure.

Encryption is also recommended or required for compliance with data protection laws.

Considerations for data encryption

  • Define the type of the data to be protected: data in transit or data at rest.
  • Identify the scope of the data to be protected: purpose, ownership, access, etc.
  • Provide strong passwords for the cryptographic file.
  • Do not reuse passwords for different data encryption operations.
  • Store passwords in a secure password management system.
  • Make sure only authorized users have access to the password and the cryptographic file required to decrypt the data.
  • Stronger encryption methods generally require more computing resources.
  • Separate the cryptographic file from the encrypted data to keep your data secure.
  • It is advised to use different cryptographic files to encrypt different datasets.
  • Data encryption is not a complete security approach. Combining different security layers helps address concerns about sensitive data. Other security layers include vulnerability assessment and management, and anti-malware solutions.

Data encryption methods

The tDataEncrypt component encrypts data using either the AES-GCM or the Blowfish encryption method:
AES-256                                                                    Blowfish
-------------------------------------------------------------------------  -------------------------------------
GCM mode of operation                                                      CBC mode of operation
Uses a randomly generated 256-bit key                                      Uses a randomly generated 256-bit key
Integrity check                                                            No integrity check
Faster on modern CPUs                                                      Computationally faster
Patented                                                                   Unpatented
Standardized by the National Institute of Standards and Technology (NIST)  -
Used by SSL/TLS                                                            -
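
To illustrate what the integrity check of the GCM mode means in practice, here is a minimal sketch using the standard javax.crypto API. It is not the component's actual code: the class name GcmIntegritySketch, its helper methods, the 12-byte IV, and the 128-bit authentication tag are assumptions chosen for the example. It shows that decrypting AES-GCM cipher text that was modified after encryption fails instead of silently returning corrupted data.

```java
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

// Hypothetical illustration class, not part of Talend.
public class GcmIntegritySketch {

    // Generates a random 256-bit AES key.
    static SecretKey newAes256Key() throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    // Encrypts with AES-GCM; the 16-byte authentication tag is appended to the output.
    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plain) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return cipher.doFinal(plain);
    }

    // Decrypts and verifies the tag; throws AEADBadTagException if verification fails.
    static byte[] decrypt(SecretKey key, byte[] iv, byte[] cipherText) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return cipher.doFinal(cipherText);
    }

    // Flips one bit of a copy of the cipher text and reports whether decryption rejects it.
    static boolean detectsTampering(SecretKey key, byte[] iv, byte[] cipherText) throws Exception {
        byte[] tampered = cipherText.clone();
        tampered[0] ^= 0x01;
        try {
            decrypt(key, iv, tampered);
            return false;
        } catch (AEADBadTagException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newAes256Key();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        byte[] cipherText = encrypt(key, iv, "sensitive value".getBytes(StandardCharsets.UTF_8));
        System.out.println("Tampering detected: " + detectsTampering(key, iv, cipherText));
    }
}
```

A cipher in CBC mode without a separate authentication step has no equivalent check: modified cipher text can decrypt to altered data without raising an integrity error.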

The data encryption process

The data encryption process includes the following steps:
  1. Generating the cryptographic file. It contains:
    • A randomly generated salt used to derive a cryptographic key from the user-defined password using the PBKDF2 key derivation function.
    • A randomly generated 256-bit key encrypted with AES and the user-defined password.
    • The encryption method encrypted with AES and the user-defined password.
  2. Accessing the encrypted data from the cryptographic file by:
    1. Using the randomly generated salt to derive the cryptographic key from the user-defined password.
    2. Using the cryptographic key and the AES method to decrypt the randomly generated 256-bit key and the encryption method.

    During decryption, if the password is correct, the component can access the encryption method and the randomly generated 256-bit key. Otherwise, access is denied.

  3. Encrypting the data using:
    • The randomly generated 256-bit key from the cryptographic file
    • The encryption method
    • A random initialization vector (IV) generated for each data value
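
The three steps above can be sketched with the standard javax.crypto API. This is an illustration under stated assumptions, not the component's implementation: the class EncryptionProcessSketch, the in-memory CryptoFile layout, the PBKDF2 hash (HMAC-SHA256) and iteration count, the use of GCM to protect the key and method name, and the salt and IV sizes are all hypothetical. Only the AES-GCM method is covered in step 3.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical illustration class, not part of Talend.
public class EncryptionProcessSketch {
    static final SecureRandom RNG = new SecureRandom();

    // Hypothetical in-memory stand-in for the cryptographic file.
    static final class CryptoFile {
        final byte[] salt, iv, wrapped;
        CryptoFile(byte[] salt, byte[] iv, byte[] wrapped) {
            this.salt = salt; this.iv = iv; this.wrapped = wrapped;
        }
    }

    static byte[] random(int n) { byte[] b = new byte[n]; RNG.nextBytes(b); return b; }

    // Derives an AES key from the password and salt with PBKDF2 (hash and iterations assumed).
    static SecretKeySpec deriveKey(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, 65536, 256);
        byte[] raw = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256").generateSecret(spec).getEncoded();
        return new SecretKeySpec(raw, "AES");
    }

    static byte[] aesGcm(int mode, SecretKeySpec key, byte[] iv, byte[] in) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, key, new GCMParameterSpec(128, iv));
        return cipher.doFinal(in);
    }

    // Step 1: random salt, random 256-bit data key, and method name, the last two encrypted.
    static CryptoFile generateFile(char[] password, String method) throws GeneralSecurityException {
        byte[] salt = random(16);
        byte[] name = method.getBytes(StandardCharsets.UTF_8);
        byte[] payload = Arrays.copyOf(random(32), 32 + name.length); // data key, then method name
        System.arraycopy(name, 0, payload, 32, name.length);
        byte[] iv = random(12);
        return new CryptoFile(salt, iv, aesGcm(Cipher.ENCRYPT_MODE, deriveKey(password, salt), iv, payload));
    }

    // Step 2: re-derive the key from the password; a wrong password makes decryption throw.
    static byte[] open(CryptoFile file, char[] password) throws GeneralSecurityException {
        return aesGcm(Cipher.DECRYPT_MODE, deriveKey(password, file.salt), file.iv, file.wrapped);
    }

    // Step 3: encrypt one value with the data key and a fresh random IV, stored before the cipher text.
    static byte[] encryptValue(byte[] payload, byte[] plain) throws GeneralSecurityException {
        SecretKeySpec dataKey = new SecretKeySpec(Arrays.copyOf(payload, 32), "AES");
        byte[] iv = random(12);
        byte[] cipherText = aesGcm(Cipher.ENCRYPT_MODE, dataKey, iv, plain);
        byte[] out = Arrays.copyOf(iv, iv.length + cipherText.length);
        System.arraycopy(cipherText, 0, out, iv.length, cipherText.length);
        return out;
    }

    // Reverses encryptValue: reads the IV from the first 12 bytes, then decrypts the rest.
    static byte[] decryptValue(byte[] payload, byte[] in) throws GeneralSecurityException {
        SecretKeySpec dataKey = new SecretKeySpec(Arrays.copyOf(payload, 32), "AES");
        return aesGcm(Cipher.DECRYPT_MODE, dataKey, Arrays.copyOf(in, 12), Arrays.copyOfRange(in, 12, in.length));
    }

    public static void main(String[] args) throws GeneralSecurityException {
        CryptoFile file = generateFile("example password".toCharArray(), "AES-GCM");
        byte[] payload = open(file, "example password".toCharArray());
        byte[] encrypted = encryptValue(payload, "Jane Doe".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decryptValue(payload, encrypted), StandardCharsets.UTF_8));
    }
}
```

Because a new IV is drawn for every call to encryptValue, encrypting the same value twice yields different cipher text. AES with 256-bit keys relies on the unlimited-strength cryptography policy, which is enabled by default from Java 8u161.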