Create the second Job - 7.0

Big Data Job Examples

Author: Talend Documentation Team
Version: 7.0
Products: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Open Studio for Big Data, Talend Real-Time Big Data Platform
Task: Design and Development > Designing Jobs; Design and Development > Designing Jobs > Hadoop distributions; Design and Development > Designing Jobs > Job Frameworks > Standard
Platform: Talend Studio

Follow these steps to create the second Job, which uploads the access log file to HCatalog:

Procedure

  1. Create a new Job and name it B_HCatalog_Load to identify its role and execution order among the example Jobs.
  2. From the Palette, drop a tApacheLogInput, a tFilterRow, a tHCatalogOutput, and a tLogRow component onto the design workspace.
  3. Connect the tApacheLogInput component to the tFilterRow component using a Row > Main connection, and then connect the tFilterRow component to the tHCatalogOutput component using a Row > Filter connection.

    This data flow loads the log file to be analyzed into the HCatalog database and removes any records that have the error code "301".

  4. Connect the tFilterRow component to the tLogRow component using a Row > Reject connection.

    This flow prints the records that have the error code "301" to the console.

  5. Label these components to better identify their functionality.
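
The Filter and Reject flows set up in steps 3 and 4 amount to splitting the log records into two streams. The following Python snippet is only a minimal sketch of that routing logic, not the code Talend Studio generates: the `LogRecord` type, its field names, the `split_by_code` function, and the sample records are all hypothetical and chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class LogRecord(NamedTuple):
    """Hypothetical, simplified access log record (field names are assumptions)."""
    host: str
    request: str
    code: str


def split_by_code(
    records: List[LogRecord], rejected_code: str = "301"
) -> Tuple[List[LogRecord], List[LogRecord]]:
    """Split records into (kept, rejected), mimicking the Filter and Reject flows."""
    kept: List[LogRecord] = []
    rejected: List[LogRecord] = []
    for record in records:
        if record.code == rejected_code:
            rejected.append(record)  # Reject flow: would go to the console
        else:
            kept.append(record)      # Filter flow: would be loaded to HCatalog
    return kept, rejected


# Made-up sample records, for illustration only.
sample = [
    LogRecord("192.0.2.1", "GET /index.html", "200"),
    LogRecord("192.0.2.2", "GET /old-page", "301"),
    LogRecord("192.0.2.3", "GET /missing", "404"),
]

kept, rejected = split_by_code(sample)

# Print the rejected records, as the tLogRow component does for the Reject flow.
for record in rejected:
    print(record)
```

In this sketch, `kept` stands in for the rows sent on to tHCatalogOutput and `rejected` for the rows sent to tLogRow. In the Job itself, the filter condition is configured in the tFilterRow component rather than written by hand.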