Creating a match analysis - Cloud - 7.3

Talend Studio User Guide

Available in:
Big Data Platform
Cloud API Services Platform
Cloud Big Data Platform
Cloud Data Fabric
Cloud Data Management Platform
Data Fabric
Data Management Platform
Data Services Platform
MDM Platform
Real-Time Big Data Platform

The match analysis enables you to compare a set of columns in databases or in delimited files, and create groups of similar records using blocking and matching keys and survivorship rules.

Before you begin

At least one database or file connection is defined under the Metadata node.

About this task

This analysis enables you to create match rules and test them on data to assess the number of duplicates. You can test match rules only on columns in the same table.

Available in:
Big Data Platform
Data Fabric
Data Management Platform
Data Services Platform
MDM Platform
Real-Time Big Data Platform

Talend DQ Portal is deprecated from Talend 7.1 onwards.

Procedure

  1. Create the connection to a data source from inside the editor if no connection has been defined under the Metadata folder in the Studio tree view.
    For further information, see Configuring the match analysis.
  2. Define the table or the group of columns in which you want to search for similar records using match processes.
  3. Define blocking keys to reduce the number of pairs that need to be compared.
    For further information, see Defining a match rule.
  4. Define match keys, that is, the match methods according to which similar records are grouped together.
    For further information, see Defining a match rule.
  5. Export the match rules from the match analysis editor and centralize them in the Studio repository.
    For further information, see Importing or exporting match rules.
  6. Generate reports on the match analyses and save them in a remote database. These reports let you compare current and historical statistics to track how the data evolves.
    For more information, see What are reports?
  7. Access the analytical tools that enable you to explore and monitor the reports generated in the Studio. For more information about the Portal, see the Talend DQ Portal User and Administrator Guide. For more information about installing the Portal, see the Talend Installation and Upgrade Guide.
    Available in:
    Big Data Platform
    Data Fabric
    Data Management Platform
    Data Services Platform
    MDM Platform
    Real-Time Big Data Platform
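
To make the roles of blocking keys and match keys in the procedure above more concrete, here is a minimal Python sketch of the general technique. It is not Talend code and does not reproduce the Studio's match algorithms: the sample records, the choice of zip as blocking key, the name comparison based on difflib, and the 0.85 threshold are all assumptions made for illustration only.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample data; in the Studio this would be a table or a delimited file.
records = [
    {"id": 1, "name": "John Smith", "zip": "75001"},
    {"id": 2, "name": "Jon Smith", "zip": "75001"},
    {"id": 3, "name": "Mary Jones", "zip": "75001"},
    {"id": 4, "name": "John Smith", "zip": "10115"},
]


def blocking_key(record):
    """Assumed blocking key: only records sharing a zip code are compared."""
    return record["zip"]


def is_match(a, b, threshold=0.85):
    """Assumed match key: case-insensitive name similarity above a threshold."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold


def candidate_pairs(rows):
    """Yield only the pairs that fall into the same block."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[blocking_key(row)].append(row)
    for block in blocks.values():
        yield from combinations(block, 2)


def group_similar(rows):
    """Group record ids whose pairs satisfy the match key (union-find)."""
    parent = {row["id"]: row["id"] for row in rows}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in candidate_pairs(rows):
        if is_match(a, b):
            parent[find(b["id"])] = find(a["id"])

    groups = defaultdict(list)
    for row in rows:
        groups[find(row["id"])].append(row["id"])
    return sorted(groups.values())


if __name__ == "__main__":
    print(len(list(candidate_pairs(records))))  # pairs compared after blocking
    print(group_similar(records))
```

With these four sample records, blocking on zip leaves 3 pairs to compare instead of the 6 possible pairs. Records 1 and 2 end up in the same group, while record 4 is never compared with record 1 because it falls in a different block. This shows the trade-off of a blocking key: fewer comparisons, but duplicates split across blocks are missed.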