Scenario: Execution of a Job in the Data Quality Service Hub Studio - 6.1

Talend Components Reference Guide

Version: 6.1
Products: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Open Studio for Big Data, Talend Open Studio for Data Integration, Talend Open Studio for Data Quality, Talend Open Studio for ESB, Talend Open Studio for MDM, Talend Real-Time Big Data Platform
Task: Data Governance, Data Quality and Preparation, Design and Development
Platform: Talend Studio

This scenario describes a DQ Batch Suite job whose execution results are processed in the Data Quality Service Hub Studio. The input source for the job is provided by the Data Quality Service Hub Studio.

The job was fully defined in the DQ Batch Suite and saved under the name "BTGeneric_Sample". In the Input function, the file "btinput.csv", located in the job directory, was specified as the input file and all fields were assigned. The file does not yet exist physically, as it will only be provided by the Data Quality Service Hub Studio, so the job cannot run yet.
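As a rough sketch of this handoff: at run time the Data Quality Service Hub Studio writes the incoming rows into btinput.csv before the DQ Batch Suite job is triggered. The snippet below only illustrates the expected shape of that file; the directory and the column names (ID, NAME, CITY) are hypothetical and must match the field assignments made in the Input function.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class BtInputSketch {
    public static void main(String[] args) throws IOException {
        // Hypothetical job directory and columns; in the real scenario the
        // Data Quality Service Hub Studio creates this file at run time.
        Path input = Path.of("C:/dqbatch/jobs/BTGeneric_Sample/btinput.csv");

        List<String> lines = List.of(
                "ID;NAME;CITY",              // header with the assigned fields
                "1;Example Corp;Stuttgart"   // one sample record
        );

        Files.write(input, lines, StandardCharsets.UTF_8);
        System.out.println("Wrote " + lines.size() + " lines to " + input);
    }
}
```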

In the Data Quality Service Hub Studio, the input source for this scenario (here, a table from an Oracle database) has already been saved in the Repository, so all schema metadata is available.
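For orientation, the tOracleInput component used in the steps below essentially issues a SELECT against this table and passes each row downstream. The plain JDBC sketch that follows shows the equivalent read; the connection URL, the credentials, and the CITY column are assumptions for illustration only and would normally come from the Repository metadata.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ReadLocations {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details -- replace with the values stored
        // in the Repository metadata for the Oracle connection.
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";

        try (Connection con = DriverManager.getConnection(url, "hr", "secret");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM LOCATIONS")) {

            while (rs.next()) {
                // The column name is assumed here; tOracleInput exposes the
                // columns defined in the retrieved schema instead.
                System.out.println(rs.getString("CITY"));
            }
        }
    }
}
```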

  1. In the Repository view, expand the Metadata node and the directory in which you saved the source. Then drag this source into the design workspace.

    A dialog box opens, listing the components available for this metadata.

  2. Select tOracleInput and then click OK to close the dialog box.

    The component is displayed in the workspace. The table used in this scenario is called LOCATIONS.

  3. Drag the following components from the Palette into the design workspace: two tMap components, tOracleOutput and tUniservBTGeneric.

  4. Connect tMap with tUniservBTGeneric first.

    Accept the schema from tUniservBTGeneric by clicking Yes in the prompt window.

  5. Connect the remaining components using Row > Main links.

  6. Double-click tUniservBTGeneric to open its Basic Settings view.

  7. Enter the connection data for the DQ Batch Suite job. Note that an absolute path must be entered in the Job File Path field.

  8. Click Retrieve Schema to create a schema for tUniservBTGeneric from the input and output definitions of the DQ Batch Suite job and to fill in the fields in the Advanced Settings automatically.

  9. Check the details in the Advanced Settings view. The input and output definitions must match those of the DQ Batch Suite job exactly. If necessary, adapt the path for the temporary files.

  10. Double-click tMap_1 to open the schema mapping window. On the left is the structure of the input source; on the right is the schema of tUniservBTGeneric (and thus the input for the DQ Batch Suite job). At the bottom is the Schema Editor, where you can view and edit the attributes of the individual columns.

  11. Assign the columns of the input source to the respective columns of tUniservBTGeneric. To do so, select a column of the input source and drag it onto the appropriate column on the right side. A sketch of the resulting per-row assignment follows this procedure.

    Click OK to close the dialog box.

  12. Then define how to process the execution results of the job, including which components will be used.

  13. Before starting the Job, make sure that all path details are correct, that the DQ Batch Suite server is running, and that you can access the job.
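The drag-and-drop mapping from step 11 amounts to a simple column-to-column assignment per row. The sketch below shows that assignment in plain Java; all record types and column names are hypothetical stand-ins for the input schema and the tUniservBTGeneric schema, and in the actual Job this code is generated by tMap rather than written by hand.

```java
// Hypothetical record types standing in for the tMap input and output schemas.
class LocationsRow {
    String streetAddress;
    String postalCode;
    String city;
}

class BtGenericInputRow {
    String street;
    String zip;
    String town;
}

public class MapLocationsSketch {
    // Equivalent of the column-to-column assignment made in tMap_1:
    // each input column is dragged onto the matching output column.
    static BtGenericInputRow map(LocationsRow in) {
        BtGenericInputRow out = new BtGenericInputRow();
        out.street = in.streetAddress;
        out.zip = in.postalCode;
        out.town = in.city;
        return out;
    }

    public static void main(String[] args) {
        LocationsRow row = new LocationsRow();
        row.streetAddress = "1297 Via Cola di Rie";
        row.postalCode = "00989";
        row.city = "Roma";

        BtGenericInputRow mapped = map(row);
        System.out.println(mapped.street + ", " + mapped.zip + " " + mapped.town);
    }
}
```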