How to define component properties

Talend Data Management Platform Studio User Guide

Version: 6.2
Product: Talend Data Management Platform
Task: Data Quality and Preparation; Design and Development
Platform: Talend Studio

The properties of each component in a Job or subjob define the actual technical implementation of the active Job.

Each component is defined by basic and advanced properties, shown respectively on the Basic Settings tab and the Advanced Settings tab of the Component view for the component selected in the design workspace. The Component view also gathers other information related to the component in use, including the View and Documentation tabs.

For an example of a basic Job design, see Getting started with a basic Job.

Basic Settings tab

The Basic Settings tab is part of the Component view, which is located in the lower part of the design editor of the Integration perspective of Talend Studio.

Each component has specific basic settings according to the function it performs within the Job. For a detailed description of each component's properties and use, see Talend Components Reference Guide.

Note

Some components require code to be input or functions to be set. Make sure you use Java code in properties.
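As an illustration of the note above, the sketch below shows the kind of Java expression that can be typed into a property field: string literals are enclosed in double quotes and joined to other values with the + operator. The class name, the baseDir value and the file name are hypothetical and exist only to make the example runnable outside the Studio.

```java
public class PropertyExpressionSketch {
    // Hypothetical stand-in for a value that a Job would normally supply.
    static final String baseDir = "/tmp/data";

    // The returned expression is the part you would type into a field,
    // for example a file name field: a quoted literal joined to a variable.
    static String fileName() {
        return baseDir + "/" + "customers.csv";
    }

    public static void main(String[] args) {
        System.out.println(fileName());
    }
}
```

A value typed without double quotes is read as a Java identifier rather than as text, which is a common cause of compilation errors in a Job.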

For File and Database components, you can centralize properties in metadata files located in the Metadata directory of the Repository tree view. This means that on the Basic Settings tab you can either set properties on the spot, using the Built-In Property Type, or reuse the properties you stored in the Metadata Manager, using the Repository Property Type. The latter option saves you time.

Select Repository as Property Type and choose the metadata file holding the relevant information. Related topic: Managing Metadata.

Alternatively, you can drop the Metadata item from the Repository tree view directly onto the component in the design workspace, so that its properties are filled in automatically.

If you selected the Built-in mode and set the properties of a component manually, you can also save those properties as metadata in the Repository. To do so:

  1. Click the floppy disk icon. The metadata creation wizard corresponding to the component opens.

  2. Follow the steps in the wizard. For more information about the creation of metadata items, see Managing Metadata.

  3. The metadata is then displayed under the Metadata node of the Repository.

For all components that handle a data flow (most components), you can define a Talend schema in order to describe and possibly select the data to be processed. Like the Properties data, this schema is either Built-in or stored remotely in the Repository in a metadata file that you created. A detailed description of the Schema setting is provided in the next sections.

How to set a built-in schema

A schema created as Built-in is meant for a single use in a Job and therefore cannot be reused in another Job.

Select Built-in in the Property Type list of the Basic settings view, and click the Edit Schema button to create your built-in schema by adding columns and describing their content, according to the input file definition.

Make sure the data type in the Type column is correctly defined.

For more information regarding Java data types, including date pattern, see Java API Specification.
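As a minimal sketch of how a date pattern behaves, the example below parses and reformats a date with java.text.SimpleDateFormat. It assumes that the pattern entered for a Date column follows the same pattern letters as this class; the class name, method name and sample values are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DatePatternSketch {
    // Parses input with inPattern and writes it back out with outPattern.
    static String reformat(String input, String inPattern, String outPattern) {
        SimpleDateFormat in = new SimpleDateFormat(inPattern);
        in.setLenient(false); // reject impossible dates such as 31-02-2015
        try {
            Date parsed = in.parse(input);
            return new SimpleDateFormat(outPattern).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Value does not match pattern " + inPattern, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reformat("31-12-2015", "dd-MM-yyyy", "yyyy/MM/dd"));
    }
}
```

Note that pattern letters are case-sensitive: MM stands for the month, while mm stands for minutes.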

Below are the commonly used Talend data types:

  • Object: a generic Talend data type that allows processing data without regard to its content. For example, a data file not otherwise supported can be processed with a tFileInputRaw component by specifying that it has a data type of Object.

  • List: a space-separated list of primitive type elements in an XML Schema definition, defined using the xsd:list element.

  • Dynamic: a data type that can be set for a single column at the end of a schema to allow processing fields as VARCHAR(100) columns named either as 'Column<X>' or, if the input includes a header, from the column names appearing in the header. For more information, see Dynamic schema.

  • Document: a data type that allows processing an entire XML document without regard to its content.

In all output properties, you also have to define the schema of the output. To retrieve the schema defined in the input schema, click the Sync columns button in the Basic settings view.

Warning

When creating a database table, it is recommended that you specify the Length field for all columns of type String, Integer or Long, and the Precision field for all columns of type Double, Float or BigDecimal, in the schema of the component used. Otherwise, unexpected errors may occur.

How to set a repository schema

If you often use certain database connections or specific files when creating your data integration Jobs, you can avoid defining the same properties over and over again by creating metadata files and storing them in the Metadata node in the Repository tree view of the Integration perspective.

To recall a metadata file into your current Job, select Repository in the Schema list and then select the relevant metadata file. Alternatively, drop the metadata item from the Repository tree view directly onto the component in the design workspace. Then click Edit Schema to check that the data is appropriate.

You can edit a repository schema used in a Job from the Basic settings view. However, note that the schema then becomes Built-in in the current Job.

You can also use a repository schema partially. For more information, see How to use a repository schema partially.

Note

You cannot change the schema stored in the repository from this window. To edit the schema stored remotely, right-click it under the Metadata node and select the corresponding edit option (Edit connection or Edit file) from the contextual menu.

Related topics: Managing Metadata.

How to use a repository schema partially

When using a repository schema, if you do not want to use all the predefined columns, you can select particular columns without changing the schema into a built-in one.

The following describes how to use a repository schema partially for a database input component. The procedure may vary slightly according to the component you are using.

  1. Click the [...] button next to Edit schema on the Basic settings tab. The [Edit parameter using repository] dialog box appears. By default, the option View schema is selected.

  2. Click OK. The [Schema] dialog box pops up, which displays all columns in the schema. The Used Column check box before each column name indicates whether the column is used.

  3. Select the columns you want to use.

  4. Click OK. A message box appears, which prompts you to do a guess query.

    Note

    The guess query operation is needed only for the database metadata.

  5. Click OK to close the message box. The [Propagate] dialog box appears. Click Yes to propagate the changes and close the dialog box.

  6. On the Basic settings tab, click Guess Query. The selected column names are displayed in the Query area as expected.

For more information about how to set a repository schema, see How to set a repository schema.
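To make the effect of the column selection concrete, the sketch below builds a SELECT statement from the list of used columns, which is roughly what appears in the Query area after you click Guess Query. This is not the Studio's own implementation: the class name, method name and exact query layout are assumptions, and the real generated query may quote or qualify identifiers differently depending on the database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GuessQuerySketch {
    // Builds a query that selects only the columns marked as used.
    static String buildSelect(String table, List<String> usedColumns) {
        if (usedColumns.isEmpty()) {
            throw new IllegalArgumentException("At least one column must be selected");
        }
        List<String> qualified = new ArrayList<>();
        for (String column : usedColumns) {
            qualified.add(table + "." + column); // qualify each column with its table
        }
        return "SELECT " + String.join(", ", qualified) + " FROM " + table;
    }

    public static void main(String[] args) {
        // Hypothetical table with only two of its columns selected.
        System.out.println(buildSelect("customers", Arrays.asList("id", "name")));
    }
}
```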

How to set a field dynamically (Ctrl+Space bar)

On any field of your Job or component Properties view, you can press Ctrl+Space bar to access the list of global and context variables and set the relevant field value dynamically.

  1. Place the cursor on any field of the Component view.

  2. Press Ctrl+Space bar to access the proposal list.

  3. Select the relevant parameters from the list. Next to the variable list, an information panel provides details about the selected parameter.

    This can be any parameter, including error messages, the number of lines processed, and so on. The list varies according to the selected component or the context you are working in.

    Related topic: Using contexts and variables.
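The sketch below illustrates what a selected global variable amounts to in Java terms: a value read from a map by a key that combines the component name and the variable name, then cast to its type. It assumes the map is called globalMap and that the key follows the form shown; the component name tFileInputDelimited_1 and the value 42 are made up so the example can run on its own.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {
    // Stand-in for the map a running Job maintains; the key and value are made up.
    static Map<String, Object> sampleGlobalMap() {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileInputDelimited_1_NB_LINE", 42);
        return globalMap;
    }

    // The cast-and-get expression is the kind of code a proposal inserts into a field.
    static String linesReadMessage(Map<String, Object> globalMap) {
        Integer lines = (Integer) globalMap.get("tFileInputDelimited_1_NB_LINE");
        return "Lines read: " + lines;
    }

    public static void main(String[] args) {
        System.out.println(linesReadMessage(sampleGlobalMap()));
    }
}
```

Because the map stores values as Object, the cast to the expected type is required before the value can be used in an expression.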

Advanced settings tab

Some components, especially File and Database components, provide numerous advanced use possibilities.

The content of the Advanced settings tab changes according to the selected component.

This tab generally holds the parameters that are not required for basic or typical use of the component but may be needed for uses outside the standard scope.

How to measure data flows

You can also find in the Advanced settings view the option tStatCatcher Statistics that allows you, if selected, to display logs and statistics about the current Job without using dedicated components. For more information regarding the stats & log features, see How to automate the use of statistics & logs.

Dynamic settings tab

The Basic settings and Advanced settings tabs of all components display various check boxes and drop-down lists for component parameters. Usually, available values for these types of parameters can only be edited when designing your Job.

The Dynamic settings tab, on the Component view, allows you to set these parameters using code or variables.

This feature allows you, for example, to define these parameters as variables and thus let them become context-dependent, whereas they are not meant to be by default.

Another benefit of this feature is that you can change the context setting at execution time. This is particularly useful when you intend to export your Job in order to deploy it onto a Job execution server, for example.
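The idea of a context-dependent parameter can be sketched as follows: a check box value that would normally be fixed at design time is instead resolved from a set of context values supplied at execution time, with the design-time value as a fallback. The class, the method names and the dieOnError parameter name are hypothetical, and java.util.Properties stands in for however the context values are actually supplied to the Job.

```java
import java.util.Properties;

public class ContextParameterSketch {
    // Builds a one-entry context; a real context would hold many variables.
    static Properties contextOf(String name, String value) {
        Properties context = new Properties();
        context.setProperty(name, value);
        return context;
    }

    // Returns the context value if present, otherwise the design-time default.
    static boolean resolveFlag(Properties context, String name, boolean designTimeDefault) {
        String value = context.getProperty(name);
        return value == null ? designTimeDefault : Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        Properties production = contextOf("dieOnError", "true");
        System.out.println(resolveFlag(production, "dieOnError", false));
    }
}
```

With this arrangement, the same exported Job can behave differently on each server simply by being run with a different context.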