Defining a data model for the campaign - 6.3

Talend Data Stewardship Getting Started Guide

author: Talend Documentation Team
EnrichVersion: 6.3
task: Data Quality and Preparation > Deduplicating data
EnrichPlatform: Talend Data Stewardship

Talend Data Stewardship is data model aware, which makes syntactic and semantic validation of data possible. A campaign therefore relies on a data model to ensure that the data matches the expected structure and format.

When you define the structure of the data for this Merging campaign, you define the attributes in the data model and select the type of each attribute from a predefined list of standard and semantic types.

In this example, the data structure represents redundant client data coming from different systems. This data model will be used in the Reconciling client data campaign, which is set up to merge duplicate client records.
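
To make the idea of a data model more concrete, the following Python sketch represents a model as a list of attributes, each with a technical identifier and a type, and checks whether a record has the expected fields. It is an illustration only: the attribute names (`first_name`, `last_name`, `email`), the type labels, and the `check_structure` helper are hypothetical and do not reflect how Talend Data Stewardship stores or validates data models internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    identifier: str  # technical identifier of the column
    type_name: str   # hypothetical label for a standard or semantic type


# Hypothetical data model describing client records (names are assumptions).
CLIENT_MODEL = [
    Attribute("first_name", "text"),
    Attribute("last_name", "text"),
    Attribute("email", "email"),
]


def check_structure(model, record):
    """Return (missing, unexpected) field identifiers for one record."""
    expected = {attribute.identifier for attribute in model}
    actual = set(record)
    return sorted(expected - actual), sorted(actual - expected)


record = {"first_name": "Ada", "last_name": "Lovelace", "phone": "555-0100"}
missing, unexpected = check_structure(CLIENT_MODEL, record)
print(missing, unexpected)  # ['email'] ['phone']
```

A record that lacks an expected attribute or carries an unknown one does not match the model's structure, which is the kind of mismatch a data model is meant to surface.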

Before you begin

  • An administrator has created users and assigned them roles in Talend Administration Center.

    For further information, see Creating Data Stewardship users.

  • You have been assigned a campaign owner role in Talend Administration Center.

  • You have accessed Talend Data Stewardship as a campaign owner.

Procedure

  1. On the home page, click MY DATA MODELS > ADD DATA MODEL.
  2. Enter a name and a description for the new model in the Name and Description fields respectively.

    Optional fields are marked as optional next to their names.

  3. In the Attributes section, define the columns you want to have in the data model as follows:
    1. In the IDENTIFIER field, enter the technical identifier for the first column.
    2. Enter a name and a description for the column in the corresponding fields, if needed.
      What you set in the NAME field is the name displayed in the task list. If no name is set, the technical identifier will be displayed.
    3. From the attribute type list, select the type of the column.
      Standard and semantic types are integrated in Talend Data Stewardship by default.
      • For standard types, additional fields are displayed or hidden according to the type you select. These fields are optional. They let you define constraints on the attribute, such as a minimum and/or maximum length, or a pattern against which to validate the attribute.
      • For semantic types, you can use the Talend Dictionary Service to manage them. However, the availability of this service depends on your license.
  4. Click the switch next to DEFINE A LIST OF VALUES to display fields where you can set specific values for the attribute.
    Any values that are not in this list are marked as invalid in the task list.
  5. If needed, click the switch next to ALLOW EMPTY VALUES to prevent empty fields from being loaded into Talend Data Stewardship. This option is enabled by default.
  6. Click ADD ATTRIBUTE in the left panel and repeat the above steps to create all the columns you need in the data model.
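
The attribute options described in the steps above (minimum and maximum length, pattern, list of values, empty values) can be pictured as a set of checks applied to each value. The Python sketch below is a minimal illustration under stated assumptions: the `AttributeRule` class, its field names, the `is_valid` function, and the sample attributes `country` and `zip` are hypothetical, and the way the product itself evaluates these constraints, or treats empty values at load time, may differ.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class AttributeRule:
    """Hypothetical container for the optional constraints of one attribute."""
    identifier: str
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    pattern: Optional[str] = None
    allowed_values: Optional[Sequence[str]] = None
    allow_empty: bool = True  # mirrors the default state of ALLOW EMPTY VALUES


def is_valid(rule: AttributeRule, value: Optional[str]) -> bool:
    """Return True if the value satisfies every constraint set on the rule."""
    if value is None or value == "":
        # Assumption: an empty value is acceptable only if the rule allows it.
        return rule.allow_empty
    if rule.min_length is not None and len(value) < rule.min_length:
        return False
    if rule.max_length is not None and len(value) > rule.max_length:
        return False
    if rule.pattern is not None and re.fullmatch(rule.pattern, value) is None:
        return False
    if rule.allowed_values is not None and value not in rule.allowed_values:
        return False
    return True


# Hypothetical attributes: a closed list of values and a five-digit pattern.
country = AttributeRule("country", allowed_values=["FR", "US", "DE"], allow_empty=False)
zip_code = AttributeRule("zip", min_length=5, max_length=5, pattern=r"\d{5}")

print(is_valid(country, "FR"))      # True
print(is_valid(country, "UK"))      # False: not in the list of values
print(is_valid(country, ""))        # False: empty values not allowed
print(is_valid(zip_code, "75011"))  # True
print(is_valid(zip_code, "7501A"))  # False: does not match the pattern
```

In this sketch, unset constraints are simply skipped, which matches the optional nature of the constraint fields in the Attributes section.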