Standardizing data - Cloud - 8.0

Data matching with Talend tools

Version
Cloud
8.0
Language
English
Product
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Content
Data Governance > Third-party systems > Data Quality components > Matching components > Continuous matching components
Data Governance > Third-party systems > Data Quality components > Matching components > Data matching components
Data Governance > Third-party systems > Data Quality components > Matching components > Fuzzy matching components
Data Governance > Third-party systems > Data Quality components > Matching components > Matching with machine learning components
Data Quality and Preparation > Third-party systems > Data Quality components > Matching components > Continuous matching components
Data Quality and Preparation > Third-party systems > Data Quality components > Matching components > Data matching components
Data Quality and Preparation > Third-party systems > Data Quality components > Matching components > Fuzzy matching components
Data Quality and Preparation > Third-party systems > Data Quality components > Matching components > Matching with machine learning components
Design and Development > Third-party systems > Data Quality components > Matching components > Continuous matching components
Design and Development > Third-party systems > Data Quality components > Matching components > Data matching components
Design and Development > Third-party systems > Data Quality components > Matching components > Fuzzy matching components
Design and Development > Third-party systems > Data Quality components > Matching components > Matching with machine learning components
Last publication date
2024-02-06
Standardizing data before trying to perform matching tasks is an essential step to improve matching accuracy.
Talend provides different ways to standardize data:
  • You can standardize data against indices. Synonyms are standardized or converted to the "master" words.

    For more information on available data synonym dictionaries, see the Data synonym dictionaries.

  • You can use address validation components to standardize address data against Experian QAS, Loqate and MelissaData validation tools. The addresses returned by these tools are consistent and variations in address representations are eliminated. As addresses are standardized, matching gets easier.

    For more information on the tQASBatchAddressRow, tLoqateAddressRow and tMelissaDataAddress components, see Address standardization.

  • You can use the tStandardizePhoneNumber component to standardize a phone number, based on the formatting convention of the country of origin.

    For more information on phone number standardization, see Phone number standardization.

  • You can use other more generic components to transform your data and get more standardized records, such as tReplace, tReplaceList, tVerifyEmail, tExtractRegexFields or tMap.