Step 3: Reference file definition, remapping, inner join mode selection - 6.5

Data Integration Job examples

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for ESB
Talend Open Studio for MDM
Talend Real-Time Big Data Platform
task
Job design and development > Job design
EnrichPlatform
Talend Studio

Procedure

  1. Define the metadata for the LosAngelesandOrangeCounties.txt file with the wizard, the same way we did previously for the California_clients file.

    At Step 1 of the wizard, name this metadata entry LA_Orange_cities.

  2. Drop this newly created metadata onto the top of the design area to automatically create a reading component that points to this metadata.
  3. Link this component to the tMap component.
  4. Double-click the tMap component again to open its interface. Note that the reference input table (row2), which corresponds to the LA and Orange county file, appears on the left of the window, right under your main input (row1).
  5. Now let's define the join between the main flow and the reference flow.

    In this use case, the join is simple to define because the City column is present in both files and the data match perfectly. Even if this were not the case, we could carry out operations directly at this level to establish a link between the data (padding, case change, and so on).

    To implement the join, drop the City column from your first input table onto the City column of your reference table. A violet link then appears to represent this join.

    Now, we are able to use the County column from the reference table in the output table (out1).

  6. Finally, click the OK button to validate your changes, and run the new Job.

    The following output should display on the console.

    As you can see, the last column is filled in only for cities in Los Angeles and Orange counties. For all other lines, this column is empty. This is because, by default, the tMap uses a left outer join. To keep only the lines for which the tMap finds a match, open the tMap again, click the tMap settings button, and select Inner Join in the Join Model list on the reference table (row2).
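
The difference between the two join modes can be sketched in plain Java. This is only an illustration of the join semantics described above, not the code that Talend Studio generates for the tMap. The class name JoinSketch, its methods, the "|" separator and the sample rows are all hypothetical. The normalize helper stands in for the kind of key operation mentioned in step 5 (padding, case change).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a City-based lookup join; not Talend-generated code.
class JoinSketch {

    // Assumed key cleanup: remove padding and ignore case before matching.
    public static String normalize(String city) {
        return city.trim().toUpperCase();
    }

    // Index the reference rows {city, county} by their normalized city name.
    public static Map<String, String> buildLookup(List<String[]> referenceRows) {
        Map<String, String> lookup = new HashMap<>();
        for (String[] row : referenceRows) {
            lookup.put(normalize(row[0]), row[1]);
        }
        return lookup;
    }

    // Join main rows {client, city} against the lookup.
    // innerJoin == false: left outer join, unmatched rows keep an empty county.
    // innerJoin == true: unmatched rows are dropped.
    public static List<String> join(List<String[]> mainRows,
                                    Map<String, String> lookup,
                                    boolean innerJoin) {
        List<String> out = new ArrayList<>();
        for (String[] row : mainRows) {
            String county = lookup.get(normalize(row[1]));
            if (county == null) {
                if (innerJoin) {
                    continue; // no match: reject the row in inner join mode
                }
                county = ""; // no match: keep the row with an empty county
            }
            out.add(row[0] + "|" + row[1] + "|" + county);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<String[]> clients = Arrays.asList(
            new String[] {"Client A", "Los Angeles"},
            new String[] {"Client B", "San Diego"},
            new String[] {"Client C", "anaheim"});
        Map<String, String> lookup = buildLookup(Arrays.asList(
            new String[] {"Los Angeles", "Los Angeles"},
            new String[] {"Anaheim", "Orange"}));

        System.out.println("Left outer join:");
        join(clients, lookup, false).forEach(System.out::println);

        System.out.println("Inner join:");
        join(clients, lookup, true).forEach(System.out::println);
    }
}
```

With this made-up data, the left outer join keeps all three client rows and leaves the county empty for San Diego, while the inner join returns only the two rows whose city is found in the reference data.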