Configuring the third Job - 7.0

Big Data Job Examples

author
Talend Documentation Team
EnrichVersion
7.0
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Open Studio for Big Data
Talend Real-Time Big Data Platform
task
Design and Development > Designing Jobs
Design and Development > Designing Jobs > Hadoop distributions
Design and Development > Designing Jobs > Job Frameworks > Standard
EnrichPlatform
Talend Studio
In this step, we will configure the third Job, C_HCatalog_Read, to check the content of the log uploaded to HCatalog.

Procedure

  1. Double-click the tHCatalogInput component to open its Basic settings view in the Component tab.
  2. From the Property Type list, select Repository, then click the [...] button to open the [Repository Content] dialog box and use a centralized HCatalog connection.
  3. Select the HCatalog connection defined for connecting to the HCatalog database and click OK.

    All the connection details are automatically filled in their respective fields.

  4. From the Schema list, select Repository, then click the [...] button next to the field that appears to open the [Repository Content] dialog box. Expand Metadata > Generic schemas > access_log and select schema. Click OK to confirm your selection and close the dialog box. The generic schema of access_log is automatically applied to the component.

    Alternatively, you can directly select the generic schema of access_log from the Repository tree view and then drag and drop it onto this component to apply the schema.

  5. In the Basic settings view of the tLogRow component, select the Vertical mode to display each row as key-value pairs when the Job is executed.
  6. Upon completion of the component settings, press Ctrl+S to save your Job configurations.
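
To illustrate what a key-value display of a row means in step 5, here is a minimal Java sketch. It is not the code Talend Studio generates and it does not reproduce the exact layout that tLogRow prints; the class name VerticalRowPrinter, the method formatVertical, and the column names host and code are all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: NOT Talend-generated code, only an illustration
// of printing one row as "column: value" lines (a vertical layout).
class VerticalRowPrinter {

    // Builds a text block with a row header followed by one line per column.
    static String formatVertical(int rowNumber, Map<String, Object> row) {
        StringBuilder sb = new StringBuilder();
        sb.append("Row ").append(rowNumber).append('\n');
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            sb.append(entry.getKey())
              .append(": ")
              .append(entry.getValue())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed column names; the real access_log schema may differ.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("host", "127.0.0.1");
        row.put("code", 200);
        System.out.print(formatVertical(1, row));
    }
}
```

A LinkedHashMap is used so that the columns are printed in the order in which they were added, which mirrors how a schema defines a fixed column order.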