Feature | Description |
---|---|
Cross-column functions | Functions that apply to several columns at once, such as concatenation and mathematical operations, are now available, making dataset cleansing and standardization more efficient. |
Extract part of a name | Using a machine-learning model, you can now split a full name into its parts, such as title, first name, middle name, last name, and suffix, making dataset cleansing and standardization more efficient. |
Extract parts of a field based on semantic definitions | Using the definitions of semantic types, you can now extract the different types of information contained in a single cell into individual columns, making dataset cleansing and standardization more efficient. |
Repeatable masking and compound semantic types masking | Data masking now supports seeds, which makes masking repeatable: identical source values are always output as the same masked values. In addition, semantic masking can now be performed on compound semantic types, enhancing data privacy. |
Convert character width | You can now use this function to convert characters to half width or full width, and to normalize strings in your datasets. |
Coalesce columns | You can use this function to retrieve the first non-null value across different columns and consolidate their data into a new column. |
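
The behavior of three of the functions listed above (coalescing columns, repeatable masking, and character width conversion) can be illustrated outside the product with a minimal Python sketch. This is not Talend's implementation or API: the function names `coalesce`, `mask_repeatable`, and `normalize_width` are hypothetical, HMAC-SHA-256 is an assumed stand-in for a seeded masking algorithm, and Unicode NFKC normalization is an assumed way to fold character widths.

```python
import hashlib
import hmac
import unicodedata


def coalesce(*values):
    """Return the first value that is not None, or None if every value is None."""
    for value in values:
        if value is not None:
            return value
    return None


def mask_repeatable(value: str, seed: str) -> str:
    """Derive a masked token from a value and a seed (assumed technique).

    The same (value, seed) pair always produces the same token,
    which is what makes the masking repeatable.
    """
    digest = hmac.new(seed.encode("utf-8"), value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def normalize_width(text: str) -> str:
    """Apply NFKC normalization.

    NFKC maps full-width Latin letters and digits to their ASCII forms
    and half-width katakana to full-width katakana.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # One row with three candidate phone columns; the first non-null value wins.
    row = {"mobile": None, "home": None, "office": "555-0100"}
    print(coalesce(row["mobile"], row["home"], row["office"]))

    # Masking the same value twice with the same seed gives identical tokens.
    first = mask_repeatable("alice@example.com", seed="project-seed")
    second = mask_repeatable("alice@example.com", seed="project-seed")
    print(first == second)

    # Full-width letters and digits fold to ASCII.
    print(normalize_width("ＡＢＣ１２３"))
```

In this sketch, changing the seed changes every masked token, so two datasets masked with the same seed remain joinable on the masked column while datasets masked with different seeds do not.
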
Known issues: https://jira.talendforge.org/issues/?filter=26475
Get started with Talend Cloud Data Preparation on this page.