In the staging data container browser, Talend Studio allows you to simulate the
matching of staging data records retrieved from a specific data container and check the
match result. If the records match, you can also view the match details.
For more information about data containers and how to browse a data container, see Data Containers.
Before you begin
A match rule has been defined and attached to a data model. The match rule and the
data model have already been deployed to the MDM server.
Note: The match simulation does not take into account the built-in blocking key
that you can select when defining the match rule.
For more information about how to
attach a match rule to a data model, see Attaching a Match Rule to a Data Model.
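The effect of ignoring a blocking key can be illustrated with a minimal sketch. It is not Talend code: the record fields, the choice of city as the blocking key, and both helper functions are hypothetical and only show the general idea that blocking restricts comparisons to records sharing a key value, whereas a simulation without blocking compares every selected record with every other one.

```python
from itertools import combinations

# Hypothetical staging records; the field names are illustrative only.
records = [
    {"id": 1, "name": "John Smith", "city": "Paris"},
    {"id": 2, "name": "Jon Smith", "city": "Paris"},
    {"id": 3, "name": "John Smyth", "city": "Lyon"},
]


def pairs_with_blocking(recs, key):
    """Return the ID pairs compared when records are first split into blocks."""
    blocks = {}
    for rec in recs:
        blocks.setdefault(rec[key], []).append(rec)
    return [
        (a["id"], b["id"])
        for block in blocks.values()
        for a, b in combinations(block, 2)
    ]


def pairs_without_blocking(recs):
    """Return the ID pairs compared when every record is compared with every other."""
    return [(a["id"], b["id"]) for a, b in combinations(recs, 2)]


blocked = pairs_with_blocking(records, "city")
unblocked = pairs_without_blocking(records)
print(blocked)    # only records in the same city are compared
print(unblocked)  # all selected records are compared with each other
```

In this sketch, record 3 is never compared with records 1 and 2 when blocking on city, but it is compared with both when no blocking key is applied, which is the situation the note describes for the simulation.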
About this task
To simulate a match operation on staging data records, do the following:
Procedure
-
In the MDM Repository tree view, expand the
Data Container node.
-
Double-click the data container from which you want to run the match
simulation to open the data container editor.
-
Click the Staging Data Container tab to open
the staging data container view.
-
Click the
icon to retrieve data records of all entities.
You can define criteria to narrow down the data records you want to retrieve.
For more information about how to browse a data container, see
Browsing a data container.
-
Select two or more records that belong to the same entity, right-click the
selection, and then select Simulate Match from the contextual menu.
-
The Match Result dialog box opens,
showing the match result for the selected data records.
If a data record does not match any of the other data records, a separate group
is created for that data record.
The match result includes the following information:
-
GRP_SIZE: Indicates the number of
similar staging data records which are grouped together.
-
CONFIDENCE: Indicates the confidence
score, a weighted match score computed by normalizing all match scores
to a value between 0 and 1.
-
SCORE: Indicates, in the form of a
percentage accurate to two decimal places, the consolidated match score
(that is, how similar two or more data records are) computed based on
all match keys defined in the match rule attached to the data model
entity. You can move your mouse over the score to view the score
expressed as a decimal.
-
ATTR_SCORE: Indicates, in the form of
a percentage accurate to two decimal places, the match score computed
based on each match key defined in the match rule attached to the data
model entity. You can move your mouse over the score to view the score
expressed as a decimal.
-
You can also simulate the match operation on customized data records. To do
so, click the Edit Records button to open the
Edit Records dialog box.
Review the data records and edit them according to your needs. Then, click the
Rerun Simulation button and check the newly
simulated match result in the Match Result
dialog box.
-
If needed, you can click the DETAILS field in
the first row and then click the [...] button
to open the Match Detail dialog box, which
gives details about how those data records are matched.
After checking the match details, click OK to
close the dialog box.
-
Once you are done with the match simulation operation, click OK to close the Match
Result dialog box.
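The relationship between the ATTR_SCORE, SCORE, and GRP_SIZE values shown in the Match Result dialog box can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm used by the MDM server: the match keys, their weights, the 0.8 threshold, the use of Python's difflib as the similarity measure, and the greedy grouping are all hypothetical. CONFIDENCE is not modeled here.

```python
from difflib import SequenceMatcher

# Hypothetical match keys and weights; a real match rule defines its own.
MATCH_KEYS = {"name": 2.0, "city": 1.0}


def attr_score(a, b):
    """Per-key similarity between 0 and 1 (stand-in for ATTR_SCORE)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def consolidated_score(rec_a, rec_b):
    """Weighted average of the per-key scores (stand-in for SCORE)."""
    total_weight = sum(MATCH_KEYS.values())
    weighted = sum(
        weight * attr_score(rec_a[key], rec_b[key])
        for key, weight in MATCH_KEYS.items()
    )
    return weighted / total_weight


def as_percentage(score):
    """Format a decimal score as a percentage with two decimal places."""
    return f"{score * 100:.2f}%"


def group_records(recs, threshold=0.8):
    """Greedy grouping: a record joins the first group whose first member it matches.

    A record that matches no existing group starts a group of its own.
    """
    groups = []
    for rec in recs:
        for group in groups:
            if consolidated_score(group[0], rec) >= threshold:
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups


sample = [
    {"name": "John Smith", "city": "Paris"},
    {"name": "John Smith", "city": "Paris"},
    {"name": "Maria Garcia", "city": "Madrid"},
]
group_sizes = [len(group) for group in group_records(sample)]
print(group_sizes)  # the size of each group plays the role of GRP_SIZE
print(as_percentage(consolidated_score(sample[0], sample[1])))
```

In this sketch, the two identical records fall into one group and the third record, which matches neither of them, forms a separate group of size one, mirroring the behavior described for unmatched records in the Match Result dialog box.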