Handling grouping tasks to decide on the relationship between pairs of records - 8.0

Talend Data Stewardship Examples

Version: 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Data Stewardship
Content: Data Governance > Assigning tasks; Data Governance > Managing campaigns; Data Governance > Managing data models; Data Quality and Preparation > Handling tasks
Last publication date: 2023-09-19

Grouping tasks consist of deciding on the relationship between the records in a group. Once you validate your choice, the task transitions to the second state defined in the workflow.

Procedure

  1. On the Tasks page, click the campaign name, Site deduplication in this example, to open a list of the tasks assigned to you.

    Example

    You need to answer a question to confirm whether suspect pairs from a list of childhood education centers are real duplicates. Once you label the records and validate your choices, a Talend Job retrieves the data from the campaign and uses it to match data on Spark.
  2. Select one task, or use the Ctrl or Shift key to select multiple tasks, and click Yes, No or Not sure to confirm the relationship between the data pairs.
    Tasks are marked in green to show that a decision has been made on them, and your choice is listed in the Arbitration column.
  3. Click Validate choices in the top-right corner of the page to validate the choices you have made on the tasks.

Results

Your choices are saved, and the data records are resolved, validated and removed from your list.
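
The effect of steps 2 and 3 can be pictured with a small, purely illustrative Python model. This is not the Talend Data Stewardship API: the class GroupingTask, the functions set_choice and validate_choices, and the state names "New" and "Resolved" are all hypothetical, chosen only to show that a choice is first recorded on each selected task and that validating then moves every decided task to the next workflow state and out of your list.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical two-state workflow; a real campaign defines its own states.
WORKFLOW = ["New", "Resolved"]

# The three answers offered on a grouping task.
CHOICES = {"Yes", "No", "Not sure"}


@dataclass
class GroupingTask:
    task_id: int
    state: str = WORKFLOW[0]
    arbitration: Optional[str] = None  # what the Arbitration column would show


def set_choice(selected: List[GroupingTask], choice: str) -> None:
    """Record the same choice on every selected task (step 2)."""
    if choice not in CHOICES:
        raise ValueError(f"unknown choice: {choice!r}")
    for task in selected:
        task.arbitration = choice


def validate_choices(tasks: List[GroupingTask]) -> List[GroupingTask]:
    """Move decided tasks to the next state (step 3).

    Returns the tasks that have no choice yet and so stay in the list.
    """
    remaining = []
    for task in tasks:
        if task.arbitration is None:
            remaining.append(task)
            continue
        next_index = min(WORKFLOW.index(task.state) + 1, len(WORKFLOW) - 1)
        task.state = WORKFLOW[next_index]
    return remaining


if __name__ == "__main__":
    my_tasks = [GroupingTask(1), GroupingTask(2), GroupingTask(3)]
    set_choice(my_tasks[:2], "Yes")       # multi-select, then click Yes
    my_tasks = validate_choices(my_tasks)  # click Validate choices
    print([t.task_id for t in my_tasks])
```

In this sketch, a task answered with Not sure is treated as decided, just like Yes or No, because the procedure describes all three as choices that are tagged and then validated.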

What to do next

Use a Talend Job to analyze the data labeled in the Site deduplication campaign and generate a matching model.

For further information, see the Job about generating a matching model from a Grouping campaign in the matching with machine learning scenarios.