In this example, you create a data model that defines the structure of the data
managed in the CRM Data Deduplication campaign. You create this campaign so that
data stewards can merge duplicate customer records stored in the enterprise
CRM.
Talend Data Stewardship is data-model aware,
which enables syntactic and semantic validation of data. You define the
attributes in the data model and select the type of each one from a predefined list of
standard or semantic types.
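The difference between the two kinds of validation can be illustrated with a minimal sketch. It is not Talend Data Stewardship code: the function names and the email pattern are hypothetical, and the sketch assumes that a syntactic check tests whether a value fits a basic type while a semantic check tests whether it looks like a particular kind of real-world data.

```python
import re

# Hypothetical illustration only; this is not Talend's implementation or API.
# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_valid_integer(value: str) -> bool:
    """Syntactic check: does the text parse as an integer?"""
    try:
        int(value)
        return True
    except ValueError:
        return False


def looks_like_email(value: str) -> bool:
    """Semantic check: does the text look like an email address?"""
    return EMAIL_PATTERN.fullmatch(value) is not None
```

Under these assumptions, a value such as `4x` would fail the integer check, and a value such as `not an email` would fail the email check even though it is a perfectly valid text string.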
Procedure
-
In the home page, click .
-
Enter a name and a description for the new model.
-
In the Attributes section, define the columns you want
to have in the data model as follows:
-
In the IDENTIFIER field, enter the technical
identifier for the first column.
-
Enter a name and a description for the column in the corresponding
fields, if needed.
The value you set in the NAME field is the name
displayed in the task list. If no name is set, the technical identifier
is displayed instead.
-
From the attribute type list, select the type of the column.
Standard and semantic types are integrated in
Talend Data Stewardship by
default:
- For standard types, additional fields are displayed or hidden
according to the type you select. These optional fields let you
define constraints on the attribute, such as a minimum or
maximum length, or a pattern against which to validate the
attribute values.
- For semantic types, you can use the Talend Dictionary Service to
manage the semantic types. However, the availability of this service
depends on the license you have.
-
Optionally, click the switch next to ALLOW EMPTY VALUES
to prevent empty values from being loaded into Talend Data Stewardship. This option is enabled
by default.
-
Click ADD ATTRIBUTE in
the left panel and repeat the above steps to create all the columns you need in
the data model.
The columns defined for the CRM Data Deduplication
campaign include information about the customers and the company in which they
work, as shown in the capture.
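The attribute settings described in the procedure can be pictured as a small data structure. The following is a minimal sketch only: the `Attribute` class, its field names, and the two example columns are hypothetical and do not reflect the Talend Data Stewardship API or the actual columns of the campaign. It assumes that empty values are allowed by default, that length and pattern constraints are optional, and that the identifier is displayed when no name is set.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attribute:
    """Hypothetical stand-in for one column of a data model."""
    identifier: str                    # technical identifier (required)
    name: Optional[str] = None         # display name (optional)
    min_length: Optional[int] = None   # optional constraint
    max_length: Optional[int] = None   # optional constraint
    pattern: Optional[str] = None      # optional regular expression
    allow_empty: bool = True           # enabled by default

    def display_name(self) -> str:
        # Fall back to the technical identifier when no name is set.
        return self.name or self.identifier

    def accepts(self, value: str) -> bool:
        # Empty values are accepted only when allow_empty is on.
        if value == "":
            return self.allow_empty
        if self.min_length is not None and len(value) < self.min_length:
            return False
        if self.max_length is not None and len(value) > self.max_length:
            return False
        if self.pattern is not None and re.fullmatch(self.pattern, value) is None:
            return False
        return True


# Example columns; the identifiers and constraints are invented for illustration.
model = [
    Attribute("first_name", name="First name", max_length=50),
    Attribute("phone", pattern=r"\+?[0-9 ]{6,20}", allow_empty=False),
]
```

In this sketch, `model[1].display_name()` falls back to `phone` because no name is set, and `model[1].accepts("")` is rejected because empty values are switched off for that column.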