Setting up the Job - Cloud - 8.0

Excel

Version
Cloud
8.0
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for ESB
Talend Real-Time Big Data Platform
Module
Talend Studio
Content
Data Governance > Third-party systems > File components (Integration) > Excel components
Data Quality and Preparation > Third-party systems > File components (Integration) > Excel components
Design and Development > Third-party systems > File components (Integration) > Excel components

Procedure

  1. Double-click tFixedFlowInput_1 to open its Basic settings view and do the following.
    1. Open the schema editor by clicking the [...] button next to Edit schema and add three columns in the schema editor: name, type String; city, type String; and quantity, type Integer.
    2. Select Use Inline Content (delimited file) and enter the following in the Content field.
      Jack;Bei Jing;18
      Mary;Shang Hai;20
      Tom;Shen Zhen;25
    3. Leave the other options as they are.

    This component sets the data to be written to the Excel 2007 file.

  2. Double-click tFileOutputExcel_1 to open its Basic settings view and do the following.
    1. Select Write excel2007 file format (xlsx) and enter the path to the target file ("D:/temp/customers.xlsx" in this example) in the File Name field.
      Note:
      • You can also enter the file path by clicking the [...] button next to the File Name field and navigating to the target file or the target folder.
      • The target file will be overwritten if it already exists; otherwise, it will be created.
    2. Select Include header.
    3. Click the [...] button next to Edit schema and check that the component has the same schema as that of tFixedFlowInput_1.

      You can synchronize the schema with that of tFixedFlowInput_1 by clicking Sync columns.

    4. Leave the other options as they are.

    This component writes the data received to the target file.

  3. Double-click tFileInputExcel_1 to open its Basic settings view and do the following.
    1. Select Read excel2007 file format (xlsx) and enter the path to the Excel file to read data from in the File name/Stream field ("D:/temp/customers.xlsx" in this field).
      Note: You can also enter the file path by clicking the [...] button next to the File name/Stream field and navigating to the target file.
    2. Add a row in the Sheet list table by clicking the plus button on the bottom of the table and enter the sheet name ("Sheet1" in this example) in the Sheet column of the row.
      Note: The sheet name needs to be the name of the desired sheet in the specified Excel file.
    3. Enter 1 in the First column field and 4 in the Last column field.
      Note: The number in the Last column field needs to be bigger than the number of the columns used in the Excel file.
    4. Open the schema editor by click the [...] button next to Edit schema.
    5. Add a row in the schema editor by clicking the plus button, type doc in the Column field, select Dynamic from the drop-down list in the Type field, and click OK to close the schema editor.
      Note: The above setting makes the schema a dynamic schema. A schema is a dynamic schema if it contains a column whose Type is set to Dynamic and the column is the only column or the last column in the schema.
    6. Leave the other options as they are.

    This component extracts data from the Excel file using a dynamic schema.

  4. Double-click tLogRow_1 to open its Basic settings view and do the following.
    1. Click the [...] button next to Edit schema and check that the component has the same schema as that of tFileInputExcel_1.

      You can synchronize the schema with that of tFileInputExcel_1 by clicking Sync columns.

    2. Select Table (print values in cells of a table).
    3. Leave the other options as they are.
    This component shows the data that is extracted using a dynamic schema.
  5. Double-click tFileOutputDelimited_1 to open its Basic settings view and do the following.
    1. Enter the path to the file to write the data to ("D:/temp/customers.csv" in this example) in the File Name field.
      Note:
      • You can also enter the file path by clicking the [...] button next to the File Name field and navigating to the target file.
      • The target file will be overwritten if it already exists; otherwise, it will be created.
    2. Click the [...] button next to Edit schema and check that the component has the same schema as that of tLogRow_1.

      You can synchronize the schema with that of tLogRow_1 by clicking Sync columns.

    3. Leave the other options as they are.
    This component writes the received data to a CSV file using dynamic schema.
  6. Double-click tFileInputExcel_2 to open its Basic settings view and do the following.
    1. Select Read excel2007 file format (xlsx) and enter the path to the Excel file to read data from in the File name/Stream field ("D:/temp/customers.xlsx" in this field).
      Note: You can also enter the file path by clicking the [...] button next to the File name/Stream field and navigating to the target file.
    2. Add a row in the Sheet list table by clicking the plus button on the bottom of the table and enter the sheet name ("Sheet1" in this example) in the Sheet column of the row.
      Note: The sheet name needs to be the name of the desired sheet in the specified Excel file.
    3. Enter 1 in the Header field.
    4. Enter 1 in the First column field and 4 in the Last column field.
      Note: The number in the Last column field needs to be bigger than the number of the columns used in the Excel file.
    5. Open the schema editor by click the [...] button next to Edit schema.
    6. Add the same columns as those of tFixedFlowInput_1 in the schema editor and click OK to close the schema editor.
    7. Leave the other options as they are.

    This component extracts data from the Excel file in the normal way.

  7. Double-click tLogRow_2 to open its Basic settings view and do the following.
    1. Click the [...] button next to Edit schema and check that the component has the same schema as that of tFileInputExcel_2.

      You can synchronize the schema with that of tFileInputExcel_2 by clicking Sync columns.

    2. Select Table (print values in cells of a table).
    3. Leave the other options as they are.
    This component shows the data that is extracted in the normal way.