Setting a data model in the campaign - 6.3

Talend Data Stewardship Getting Started Guide

author: Talend Documentation Team
EnrichVersion: 6.3
task: Data Quality and Preparation > Deduplicating data
EnrichPlatform: Talend Data Stewardship

Campaign owners must select which data model to use in the campaign and set, for each role, the read/write access permission on each attribute of the selected data model.

The data model used in the campaign determines the structure of the data to be managed.

Procedure

  1. On the home page, click Model and select from the model list the data structure you want to use in the campaign.
    The data model Schema for Reconciliation of customer data is created for the Reconciling client data campaign. The model list gives access to all the data models that have already been defined on the Talend Data Stewardship server.
  2. Use the buttons next to each attribute in the data structure to set permissions per attribute and per data steward, and so define who can view or edit which attributes.
    Option Description
    Gives read/write access to the attribute in the data model.
    Gives read-only access to the attribute in the data model.

    This type of access is useful when the data steward needs to see the information to make a relevant decision but must not change the value. Examples include unique identifiers of other elements linked to the entity the steward is viewing, or data that you know is reliable and must not be changed.

    Gives no access to the attribute.

    Hiding an attribute is useful when the information is sensitive and should not be visible to the data steward, for instance financial information. It is also useful when the information is just noise for the steward, for instance a technical identifier, but still needs to be propagated as part of the task.

    For example, in this campaign you grant read-only access to the identifier attribute for the campaign participants who have the ACCOUNT ANALYST role, while other participants have read/write access.
  3. Select a rule from the Survivorship Rule list next to each attribute. These rules are applied automatically to decide which attribute values define the master records when data is loaded into the campaign. Data stewards can then manually modify these choices.
    Option           Description
    First not null   Selects the value from the first source that contains a value, where "first" is defined by the order of the records when the task is created.
    Most common      Selects the most common attribute value among the duplicates coming from one or more data sources.
    Most recent      Selects the most recent attribute value among the duplicates coming from one or more data sources. This is based on the last update date in the metadata.
    Most trusted     Selects the most trusted attribute value among the duplicates, according to the trust score you set when creating the campaign or when loading the tasks into the campaign. If no trust score is defined, this option does not work.
    You can apply one rule to all the attributes by selecting it from the list next to ALL ATTRIBUTES. If a given algorithm cannot be applied, the rule falls back to First not null. For example, if you do not set a trust score and you select Most trusted during the campaign definition, First not null is used instead. Similarly, First not null is used if you select Most common or Most recent and there are no common or no recent values among the duplicates.
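
The survivorship rules and their fallback to First not null can be illustrated with a minimal Python sketch. This is not Talend Data Stewardship code or its API: the `SourceValue` structure, the function names and the `survive` helper are all hypothetical, and the sketch assumes that "no common value" means no value appears more than once among the duplicates.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class SourceValue:
    """One duplicate's value for a single attribute (hypothetical structure)."""
    value: Optional[str]
    last_update: Optional[datetime] = None  # last update date from the metadata
    trust_score: Optional[int] = None       # trust score of the source, if set


def first_not_null(values: List[SourceValue]) -> Optional[str]:
    # "First" follows the order in which the records are given.
    return next((v.value for v in values if v.value is not None), None)


def most_common(values: List[SourceValue]) -> Optional[str]:
    counts = Counter(v.value for v in values if v.value is not None)
    if not counts:
        return None
    value, count = counts.most_common(1)[0]
    # Assumption: a value seen only once is not a "common" value.
    return value if count > 1 else None


def most_recent(values: List[SourceValue]) -> Optional[str]:
    dated = [v for v in values if v.value is not None and v.last_update is not None]
    return max(dated, key=lambda v: v.last_update).value if dated else None


def most_trusted(values: List[SourceValue]) -> Optional[str]:
    scored = [v for v in values if v.value is not None and v.trust_score is not None]
    return max(scored, key=lambda v: v.trust_score).value if scored else None


RULES = {
    "First not null": first_not_null,
    "Most common": most_common,
    "Most recent": most_recent,
    "Most trusted": most_trusted,
}


def survive(rule: str, values: List[SourceValue]) -> Optional[str]:
    """Apply a rule; fall back to First not null when it yields nothing."""
    result = RULES[rule](values)
    return result if result is not None else first_not_null(values)


duplicates = [
    SourceValue(None),
    SourceValue("Jon Smith", last_update=datetime(2016, 5, 1)),
    SourceValue("John Smith", last_update=datetime(2016, 9, 1)),
    SourceValue("Jon Smith"),
]

for rule in RULES:
    print(rule, "->", survive(rule, duplicates))
```

In this sample no trust score is set, so Most trusted cannot be applied and the sketch returns the First not null value, which mirrors the fallback described in step 3.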