Configuring the components - 7.3

MS Delimited

Version: 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Open Studio for Big Data, Talend Open Studio for Data Integration, Talend Open Studio for ESB, Talend Real-Time Big Data Platform
Module: Talend Studio

Procedure

  1. Double-click tFileInputMSDelimited to open the Multi Schema Editor.
  2. Click Browse... next to the File name field to locate the multi-schema delimited file you need to process.
  3. In the File Settings area:
    -Select from the list the encoding of the source file. This setting ensures encoding consistency across all input and output files.
    -Select the field and row separators used in the source file.
    Note:

    If the schemas in the source file use different field separators, select the Use Multiple Separator check box and define the fields that follow accordingly.

    A preview of the source file data displays automatically in the Preview panel.
    Note:

    Column 0, which usually holds the record type indicator, is selected by default. However, you can select the check box of any other column to define it as a primary key.

  4. Click Fetch Codes to the right of the Preview panel to list the schema types and records found in the source file. In this scenario, the source file has three schema types (A, B, C).
    Click each schema type in the Fetch Codes panel to display its data structure below the Preview panel.
  5. Click in the name cells and set the column names for each selected schema.
    In this scenario, the column names are as follows:
    -Schema A: Type, DiscName, Author, Date
    -Schema B: Type, SongName
    -Schema C: Type, LibraryName
    Next, set the primary key on the incoming data to ensure its uniqueness (DiscName in this scenario). To do so:
  6. In the Fetch Codes panel, select the schema holding the column you want to set as the primary key (schema A in this scenario) to display its data structure.
  7. Click in the Key cell that corresponds to the DiscName column and select the check box that appears.
  8. Click anywhere in the editor. The value in the Key cell changes from false to true.
    Next, declare the parent schema by which to group the other "child" schemas (DiscName in this scenario). To do so:
  9. In the Fetch Codes panel, select schema B and click the right arrow button to move it to the right. Then, do the same with schema C.
    Note:

    The Cardinality field is optional. It lets you define the number (or range) of fields in the "child" schemas attached to the parent schema. However, if you set the wrong number or range and then execute the Job, an error message is displayed.

  10. In the Multi Schema Editor, click OK to validate all the changes you made and close the editor.
    The three defined schemas along with the corresponding record types and field separators display automatically in the Basic settings view of tFileInputMSDelimited.
    The three schemas you defined in the Multi Schema Editor are automatically passed to the three tLogRow components.
  11. If needed, click the Edit schema button in the Basic settings view of each of the tLogRow components to view the input and output data structures you defined in the Multi Schema Editor or to modify them.
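
To make the effect of this configuration more concrete, the following minimal Python sketch mimics, outside Talend Studio, what the three schemas describe: each row is routed to one of three flows according to the record type in column 0, its fields are named using the columns set in the procedure, and the DiscName key of schema A is checked for uniqueness. The sample data, the `;` field separator, and the names `split_by_record_type` and `check_unique` are assumptions made for illustration only. This is not the code that tFileInputMSDelimited generates.

```python
import csv
import io

# Column names per record type, as set in the Multi Schema Editor.
SCHEMAS = {
    "A": ["Type", "DiscName", "Author", "Date"],
    "B": ["Type", "SongName"],
    "C": ["Type", "LibraryName"],
}


def split_by_record_type(text, field_sep=";", type_column=0):
    """Route each row to the flow of its schema, based on its record type."""
    flows = {code: [] for code in SCHEMAS}
    reader = csv.reader(io.StringIO(text), delimiter=field_sep)
    for row in reader:
        if not row:
            continue  # skip blank lines
        code = row[type_column]
        if code not in SCHEMAS:
            raise ValueError(f"unknown record type: {code!r}")
        names = SCHEMAS[code]
        if len(row) != len(names):
            raise ValueError(f"record {row!r} does not match schema {code}")
        flows[code].append(dict(zip(names, row)))
    return flows


def check_unique(rows, key):
    """Raise ValueError if two rows share the same value in the key column."""
    seen = set()
    for row in rows:
        if row[key] in seen:
            raise ValueError(f"duplicate key {key}={row[key]!r}")
        seen.add(row[key])


# Hypothetical sample content; the real source file is not shown in this procedure.
SAMPLE = """A;Disc One;Author X;2008-01-01
B;Song 1
B;Song 2
C;Library 1
"""

flows = split_by_record_type(SAMPLE)
check_unique(flows["A"], "DiscName")  # DiscName is the key column of schema A
for code, rows in flows.items():
    print(code, rows)
```

Each entry in `flows` plays the role of one output flow to a tLogRow component. If the source file uses a different field separator, pass it through the `field_sep` parameter.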