tUniservBTGeneric - 6.1

Talend Components Reference Guide

tUniservBTGeneric properties

Component family

Data quality

 

Function

tUniservBTGeneric executes a processing job created with the Uniserv product DQ Batch Suite.

Purpose

tUniservBTGeneric sends the data to the DQ Batch Suite and starts the specified DQ Batch Suite job. When the job execution is finished, the results are returned to the Data Quality Service Hub Studio for further processing.

Basic settings

Schema and Edit schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

Click Sync columns to retrieve the schema from the previous component connected in the Job.

Click Retrieve Schema to create a schema for the component that matches the input and output fields of the DQ Batch Suite job.

 

Host name

Host on which the Master Server of DQ Batch Suite runs, between double quotation marks.

 

Port

Port number on which the DQ Batch Suite server runs, between double quotation marks.

 

Client Server

Name of the client server of the DQ Batch Suite, between double quotation marks.

 

User name

User name used to log on to the DQ Batch Suite server. The specified user must have permission to execute the DQ Batch Suite job.

 

Password

Password of the specified user.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

 

Job directory

Directory in the DQ Batch Suite in which the job is saved.

 

Job name

Name of the DQ Batch Suite job that is to be executed.

 

Job file path

File path under which the DQ Batch Suite job to be executed will be saved. The path must be absolute.
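The absolute-path requirement can be checked before the value is entered. The following is a minimal Java sketch and not part of the component; the class name, method name and sample path are hypothetical. It relies only on the standard `java.nio.file` API, and the result depends on the platform the check runs on.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class JobFilePathCheck {
    // Returns true when the given path string is absolute on the current platform.
    static boolean isAbsolutePath(String jobFilePath) {
        Path path = Paths.get(jobFilePath);
        return path.isAbsolute();
    }

    public static void main(String[] args) {
        // Hypothetical relative path, for illustration only.
        String candidate = "jobs/BTGeneric_Sample";
        System.out.println(candidate + " is absolute: " + isAbsolutePath(candidate));
        // Resolving against the working directory yields an absolute path.
        String resolved = Paths.get(candidate).toAbsolutePath().toString();
        System.out.println(resolved + " is absolute: " + isAbsolutePath(resolved));
    }
}
```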

Advanced settings

Temporary directory

Directory in which the temporary files created during job execution are to be saved.

 

Input Parameters

These parameters must correspond to the parameters in the function Input (tab "Format") of the DQ Batch Suite job.

File location: State whether the input file is saved in the pool or the local job directory.

Directory: If File location is set to Pool, the directory is relative to the pool directory. If File location is set to Job, "input" must be specified here.

File name: Name of the delimiter file which has been generated by tUniservBTGeneric and is to be transferred to the DQ Batch Suite. The file name must correspond to the file name which is defined in the function Input of the DQ Batch Suite job.

No. of header rec.: 0 = no header record, 1 = header record in the input file.

Field separator: Field separator defined in the function Input of the DQ Batch Suite job.
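To illustrate how No. of header rec. and Field separator shape the input file, here is a minimal Java sketch. It is not the component's own implementation: the class name, method name, column names and sample values are hypothetical, and the sketch assumes that fields are joined with the separator without any quoting or escaping.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DelimitedInputSketch {
    // Builds the lines of a delimited file. headerRecords is 0 (no header) or 1 (one header line).
    // Assumption: no quoting or escaping of separator characters inside field values.
    static List<String> toLines(List<String> columnNames, List<List<String>> rows,
                                int headerRecords, String separator) {
        List<String> lines = new ArrayList<>();
        if (headerRecords == 1) {
            lines.add(String.join(separator, columnNames));
        }
        for (List<String> row : rows) {
            lines.add(String.join(separator, row));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical columns and values, for illustration only.
        List<String> columns = Arrays.asList("CITY", "POSTAL_CODE");
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Berlin", "10115"),
                Arrays.asList("Pforzheim", "75179"));
        List<String> lines = toLines(columns, rows, 1, ";");
        // Written as UTF-8 to a temporary file.
        Path file = Files.createTempFile("btinput", ".csv");
        Files.write(file, lines, StandardCharsets.UTF_8);
        System.out.println("Wrote " + lines.size() + " lines to " + file);
    }
}
```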

 

Output Parameters

These parameters must correspond to the parameters in the function Output (tab "Format") of the DQ Batch Suite job.

File location: State whether the output file is to be saved in the pool or the local job directory.

Directory: If File location is set to Pool, the directory is relative to the pool directory. If File location is set to Job, "output" must be specified here.

File name: Name of the output file in the delimiter format, which is created by the DQ Batch Suite job. The file name must correspond to the file name defined in the function Output of the DQ Batch Suite job.

No. of header rec.: 0 = no header record, 1 = header record in the output file.

Field separator: Field separator defined in the function Output of the DQ Batch Suite job.
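The output settings can be read the same way in reverse. The sketch below shows, in Java, how a delimited result file could be split into records given the number of header records and the field separator. It is an illustration only: the class name, method name and sample lines are hypothetical, and it assumes that the separator never occurs inside a field value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DelimitedOutputSketch {
    // Skips headerRecords leading lines and splits every remaining line on the separator.
    // The separator is quoted so that characters such as "|" are taken literally.
    static List<String[]> parse(List<String> lines, int headerRecords, String separator) {
        List<String[]> records = new ArrayList<>();
        String regex = Pattern.quote(separator);
        for (int i = headerRecords; i < lines.size(); i++) {
            // A limit of -1 keeps trailing empty fields.
            records.add(lines.get(i).split(regex, -1));
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical output lines, for illustration only.
        List<String> lines = Arrays.asList("CITY|POSTAL_CODE", "Berlin|10115", "Bonn|");
        for (String[] record : parse(lines, 1, "|")) {
            System.out.println(Arrays.toString(record));
        }
    }
}
```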

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
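In the Java code of a Job, After variables are typically read from the globalMap with a key built from the component's unique name and the variable name. The sketch below imitates this lookup with a plain `HashMap` standing in for the globalMap that Talend provides at run time. The component name tUniservBTGeneric_1, the helper method and the sample message are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class ErrorMessageSketch {
    // Looks up <componentName>_ERROR_MESSAGE; returns null if no message was stored.
    static String errorMessageOf(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_ERROR_MESSAGE");
        return value == null ? null : (String) value;
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap of a running Job (assumption for this sketch).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tUniservBTGeneric_1_ERROR_MESSAGE", "sample error text");
        System.out.println(errorMessageOf(globalMap, "tUniservBTGeneric_1"));
    }
}
```

Inside a Job, the equivalent expression would be `((String) globalMap.get("tUniservBTGeneric_1_ERROR_MESSAGE"))`, used in a component that runs after tUniservBTGeneric.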

Usage

tUniservBTGeneric sends data to DQ Batch Suite and starts the specified DQ Batch Suite job. When the execution is finished, the output data of the job is returned to Data Quality Service Hub Studio for further processing.

Limitation

To use tUniservBTGeneric, the Uniserv software DQ Batch Suite must be installed.

Note

Please note the following:

  • The job must be configured and executable in the DQ Batch Suite.

  • The user must have the authority to execute the DQ Batch Suite job.

  • The DQ Batch Suite job may only have one line.

  • The files defined in the functions Input and Output must use the record format delimiter.

  • Input and output data must be provided in the UTF-8 character set.

 

Scenario: Execution of a Job in the Data Quality Service Hub Studio

This scenario describes a DQ Batch Suite job whose execution results are processed in the Data Quality Service Hub Studio. The input source for the job is provided by the Data Quality Service Hub Studio.

The job was completely defined in the DQ Batch Suite and saved under the name "BTGeneric_Sample". In the function Input, the file "btinput.csv" was specified as the input file saved in the job directory, and all fields were assigned. The file does not yet physically exist because it will be provided by the Data Quality Service Hub Studio, so the job cannot run yet.

In the Data Quality Service Hub Studio, the input source (here a table from an Oracle database) for this scenario was already saved in the Repository, so that all schema metadata is available.

  1. In the Repository view, expand the Metadata node and the directory in which you saved the source. Then drag this source into the design workspace.

    A dialog box appears.

  2. Select tOracleInput and then click OK to close the dialog box.

    The component is displayed in the workspace. The table used in this scenario is called LOCATIONS.

  3. Drag the following components from the Palette into the design workspace: two tMap components, tOracleOutput and tUniservBTGeneric.

  4. Connect tMap with tUniservBTGeneric first.

    Accept the schema from tUniservBTGeneric by clicking Yes on the prompt window.

  5. Connect the other components via the Row > Main link.

  6. Double-click tUniservBTGeneric to open its Basic Settings view.

  7. Enter the connection data for the DQ Batch Suite job. Note that the absolute path must be entered in the field Job File Path.

  8. Click Retrieve Schema to automatically create a schema for tUniservBTGeneric from the input and output definitions of the DQ Batch Suite job and automatically fill in the fields in the Advanced Settings.

  9. Check the details in the Advanced Settings view. The definitions for input and output must match those of the DQ Batch Suite job exactly. If necessary, adapt the path for the temporary files.

  10. Double-click tMap_1 to open the schema mapping window. On the left is the structure of the input source, on the right is the schema of tUniservBTGeneric (and thus the input for the DQ Batch Suite job). At the bottom is the Schema Editor, where you can find the attributes of the individual columns and edit them.

  11. Assign the columns of the input source to the respective columns of tUniservBTGeneric. For this purpose, select a column of the input source and drag it onto the appropriate column on the right side.

    Click OK to close the dialog box.

  12. Then define how to process the execution results of the job, including which components will be used.

  13. Before starting the Job, make sure that all path details are correct, that the DQ Batch Suite server is running, and that you are able to access the job.
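Part of the check in the last step can be scripted. The following Java sketch tests whether a TCP connection to the DQ Batch Suite server can be opened. It only shows that something is listening on the given host and port; it does not verify the login or the access rights for the job. The class name, method name, host and port value are placeholders and not taken from the product.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PreflightCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unknown hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder values; use the Host name and Port from the Basic settings view.
        String host = "localhost";
        int port = 12345;
        System.out.println("Server reachable: " + isReachable(host, port, 2000));
    }
}
```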