How to create a custom component

Talend Studio ships with more than 800 components to address your data integration needs. If none of them meets a specific requirement, you can extend Talend Studio by creating your own components.

This article describes, step by step, how to create a component manually and use it in your data integration Jobs.

This tutorial covers the following topics, which are essential for developing custom components:

  • How to create a component step by step
  • How to install a custom component into your Talend Studio

As a prerequisite, we strongly recommend that you read the articles What is a component and Component code generation model to understand how a Talend component is structured, and to learn about the component code generation order and related technologies.

Creating a custom component step by step

This procedure will help you create a component manually step by step.

In this example, the component you will create is called tTutorialRow. It has only one parameter, of type TABLE, and handles a list of email addresses.

This is a very basic component, with no real added value. The main purpose of this tutorial is to put into practice the theory presented in the articles What is a component and Component code generation model, and to round out your understanding of the component creation process.

Tip: Based on our experience, studying the official Talend components is one of the best ways to become familiar with component development. For example, if you want to develop a component that reads data from a database, you can refer to tMysqlInput, tOracleInput, and similar components to see how they are defined in the XML descriptor file and how they work in the Java template files.

Step 1: Creating the component folder and the required files

Procedure

  1. Create a folder named tTutorialRow on your file system.
  2. Create the following five empty files in the newly created folder:
    • tTutorialRow_java.xml
    • tTutorialRow_messages.properties
    • tTutorialRow_begin.javajet
    • tTutorialRow_main.javajet
    • tTutorialRow_end.javajet
  3. Create an icon file named tTutorialRow_icon32.png, 32x32 pixels in size, in the tTutorialRow folder.

    The tTutorialRow folder should now contain six files: the five files listed above and the icon file.
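
The folder and the five empty files can also be created programmatically. The following is a minimal sketch rather than part of any Talend tooling: the class name ComponentSkeleton and its create method are hypothetical, and the icon file still has to be supplied separately.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ComponentSkeleton {

    // File name suffixes listed in Step 1 (the icon is not created here).
    static final String[] SUFFIXES = {
        "_java.xml", "_messages.properties",
        "_begin.javajet", "_main.javajet", "_end.javajet"
    };

    // Creates <parent>/<componentName>/ and the five empty files inside it.
    static List<Path> create(Path parent, String componentName) {
        try {
            Path folder = Files.createDirectories(parent.resolve(componentName));
            List<Path> files = new ArrayList<>();
            for (String suffix : SUFFIXES) {
                Path file = folder.resolve(componentName + suffix);
                if (Files.notExists(file)) {
                    Files.createFile(file); // leave existing files untouched
                }
                files.add(file);
            }
            return files;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical usage: the first argument is the parent directory.
        Path parent = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : create(parent, "tTutorialRow")) {
            System.out.println(file);
        }
    }
}
```

Running it a second time is harmless, because files that already exist are left as they are.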

Step 2: Editing the XML descriptor file

Procedure

  1. Open the XML descriptor file tTutorialRow_java.xml.
  2. Edit the content of tTutorialRow_java.xml as below:
    <COMPONENT>
      <HEADER
        PLATEFORM="ALL"
        SERIAL=""
        VERSION="2.0"
        STATUS="ALPHA"
      
        COMPATIBILITY="ALL"
        AUTHOR="Component Author"
        RELEASE_DATE="20070525A"
        STARTABLE="false"
      >
        <SIGNATURE/>
      </HEADER>
      
      <FAMILIES>
        <FAMILY>tutorial</FAMILY>
      </FAMILIES>
      
      <DOCUMENTATION>
        <URL/>
      </DOCUMENTATION>
      
      <CONNECTORS>
        <CONNECTOR CTYPE="FLOW" MAX_INPUT="1"/>
        <CONNECTOR CTYPE="ITERATE" MAX_OUTPUT="1" MAX_INPUT="1"/>
        <CONNECTOR CTYPE="SUBJOB_OK" MAX_INPUT="1" />
        <CONNECTOR CTYPE="SUBJOB_ERROR" MAX_INPUT="1" />
        <CONNECTOR CTYPE="COMPONENT_OK" />
        <CONNECTOR CTYPE="COMPONENT_ERROR" />
        <CONNECTOR CTYPE="RUN_IF" />
      </CONNECTORS>
      
      <PARAMETERS>
        <PARAMETER NAME="ADDRESSES" FIELD="TABLE" REQUIRED="true" NUM_ROW="3" NB_LINES="5" SHOW="true">
          <ITEMS BASED_ON_SCHEMA="false">
            <ITEM NAME="USERNAME" />
            <ITEM NAME="DOMAIN" />
          </ITEMS>
        </PARAMETER>
      </PARAMETERS>
      
      <CODEGENERATION/>
      
      <RETURNS>
        <RETURN NAME="NB_LINE" TYPE="id_Integer" AVAILABILITY="AFTER"/>
      </RETURNS>
      
    </COMPONENT>

    As described in this descriptor file:

    • The component family is "tutorial".
    • The component has a parameter named "ADDRESSES" of the TABLE type.
    • The parameter has two items (which correspond to columns), "USERNAME" and "DOMAIN".
    • The component returns a NB_LINE global variable.
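
Before moving on, it can be useful to confirm that the descriptor is well-formed XML and declares the items you expect. The sketch below uses only the JDK's standard DOM parser; DescriptorSketch and tableItems are hypothetical names, and the check covers well-formedness only, not the rules Talend Studio itself applies to a descriptor.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DescriptorSketch {

    // Returns "PARAMETER.ITEM" for every ITEM declared under a PARAMETER element.
    static List<String> tableItems(String descriptorXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    descriptorXml.getBytes(StandardCharsets.UTF_8)));
            List<String> names = new ArrayList<>();
            NodeList params = doc.getElementsByTagName("PARAMETER");
            for (int p = 0; p < params.getLength(); p++) {
                Element param = (Element) params.item(p);
                NodeList items = param.getElementsByTagName("ITEM");
                for (int i = 0; i < items.getLength(); i++) {
                    Element item = (Element) items.item(i);
                    names.add(param.getAttribute("NAME") + "." + item.getAttribute("NAME"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Descriptor is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed-down copy of the PARAMETERS section shown above.
        String xml = "<COMPONENT><PARAMETERS>"
            + "<PARAMETER NAME=\"ADDRESSES\" FIELD=\"TABLE\">"
            + "<ITEMS BASED_ON_SCHEMA=\"false\">"
            + "<ITEM NAME=\"USERNAME\"/><ITEM NAME=\"DOMAIN\"/>"
            + "</ITEMS></PARAMETER>"
            + "</PARAMETERS></COMPONENT>";
        System.out.println(tableItems(xml)); // prints [ADDRESSES.USERNAME, ADDRESSES.DOMAIN]
    }
}
```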

Step 3: Editing the message properties file

Procedure

  1. Open the message properties file tTutorialRow_messages.properties in a text editor.
  2. Define the default label for each parameter, table item, and return variable declared in the XML descriptor file:
    LONG_NAME=Tutorial component
    HELP=org.talend.help.TutorialRow
     
    NB_LINE.NAME=Number of line
    ADDRESSES.ITEM.USERNAME=Username
    ADDRESSES.ITEM.DOMAIN=Domain
    ADDRESSES.NAME=Addresses

    Given the parameter settings in the XML descriptor file and the labels in the message properties file, the component settings in the Studio show an Addresses table with two columns, Username and Domain.
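
The key naming pattern used above is PARAMETER.NAME for the parameter label and PARAMETER.ITEM.ITEMNAME for each table column. As a rough illustration of that pattern, the following sketch reports which of those keys are absent from a properties text. MessagesSketch and missingLabels are hypothetical names, and the check is limited to the keys shown in this tutorial.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class MessagesSketch {

    // Lists the label keys expected for a TABLE parameter that are not defined.
    static List<String> missingLabels(String propertiesText, String parameter, String... items) {
        Properties labels = new Properties();
        try {
            labels.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> expected = new ArrayList<>();
        expected.add(parameter + ".NAME");
        for (String item : items) {
            expected.add(parameter + ".ITEM." + item);
        }
        List<String> missing = new ArrayList<>();
        for (String key : expected) {
            if (labels.getProperty(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The DOMAIN label is deliberately left out to show a reported gap.
        String text = String.join("\n",
            "LONG_NAME=Tutorial component",
            "ADDRESSES.ITEM.USERNAME=Username",
            "ADDRESSES.NAME=Addresses");
        System.out.println(missingLabels(text, "ADDRESSES", "USERNAME", "DOMAIN"));
        // prints [ADDRESSES.ITEM.DOMAIN]
    }
}
```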

Step 4: Editing the Java template files

Procedure

  1. Open tTutorialRow_begin.javajet in a text editor and define the beginning of the component javajet code as follows:
    <%@ jet
        imports="
            org.talend.core.model.process.INode
            org.talend.core.model.process.ElementParameterParser
            org.talend.core.model.metadata.IMetadataTable
            org.talend.core.model.metadata.IMetadataColumn
            org.talend.core.model.process.IConnection
            org.talend.core.model.process.IConnectionCategory
            org.talend.designer.codegen.config.CodeGeneratorArgument
            org.talend.core.model.metadata.types.JavaTypesManager
            org.talend.core.model.metadata.types.JavaType
            java.util.List
            java.util.Map      
        "
    %>
    <%
        CodeGeneratorArgument codeGenArgument = (CodeGeneratorArgument) argument;
        INode node = (INode)codeGenArgument.getArgument();
        String cid = node.getUniqueName(); 
        List<Map<String, String>> lines = (List<Map<String,String>>)ElementParameterParser.getObjectValue(node, "__ADDRESSES__");
    %>
    java.util.List<String> addresses_<%=cid %> = new java.util.ArrayList<String>();
    <%
      for (int i=0; i<lines.size(); i++) {
        Map<String, String> line = lines.get(i);
    %>
        addresses_<%=cid %>.add(<%= line.get("USERNAME") %> + "@" + <%= line.get("DOMAIN") %>);
    <%
      }
    %>
    int nb_line_<%=cid %> = 0;

    This javajet file defines three variables:

    • lines, of type List, which exists only at code generation time and stores all the rows of the table filled in by the user.
    • addresses_<%=cid %>, of type List, to store the concatenation of the USERNAME and DOMAIN columns for each row.
    • nb_line_<%=cid %>, to count the lines processed by the tTutorialRow component.
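
To make the difference between generation time and run time concrete, the sketch below imitates in plain Java what the begin template emits. It does not use the Talend code generator: BeginExpansionSketch and expandBegin are hypothetical names, and the component ID tTutorialRow_1 and the sample row are assumed values. Because each cell is copied verbatim into the generated code, the sample values are written as quoted Java string literals.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BeginExpansionSketch {

    // Builds the Java source text that the begin template would produce
    // for the given component ID and table rows.
    static String expandBegin(String cid, List<Map<String, String>> lines) {
        StringBuilder out = new StringBuilder();
        out.append("java.util.List<String> addresses_").append(cid)
           .append(" = new java.util.ArrayList<String>();\n");
        for (Map<String, String> line : lines) {
            // Cell values are inserted as-is, so they must be valid Java expressions.
            out.append("addresses_").append(cid).append(".add(")
               .append(line.get("USERNAME")).append(" + \"@\" + ")
               .append(line.get("DOMAIN")).append(");\n");
        }
        out.append("int nb_line_").append(cid).append(" = 0;\n");
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("USERNAME", "\"john\"");       // typed in the table as "john"
        row.put("DOMAIN", "\"example.com\"");  // typed in the table as "example.com"
        System.out.print(expandBegin("tTutorialRow_1", Collections.singletonList(row)));
    }
}
```

With the sample row, the middle line of the output is addresses_tTutorialRow_1.add("john" + "@" + "example.com"); which shows that the concatenation itself happens when the Job runs, not when the code is generated.
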
  2. Open tTutorialRow_main.javajet in a text editor and define the main part of the component javajet code as follows:
    <%@ jet
        imports="
            org.talend.core.model.process.INode
            org.talend.core.model.process.ElementParameterParser
            org.talend.core.model.metadata.IMetadataTable
            org.talend.core.model.metadata.IMetadataColumn
            org.talend.core.model.process.IConnection
            org.talend.core.model.process.IConnectionCategory
            org.talend.designer.codegen.config.CodeGeneratorArgument
            org.talend.core.model.metadata.types.JavaTypesManager
            org.talend.core.model.metadata.types.JavaType
            java.util.List
            java.util.Map      
        "
    %>
    <%
        CodeGeneratorArgument codeGenArgument = (CodeGeneratorArgument) argument;
        INode node = (INode)codeGenArgument.getArgument();
        String cid = node.getUniqueName(); 
    %>
        String[] adresses_<%=cid %> = addresses_<%=cid %>.toArray(new String[] {});
         
        System.out.print(nb_line_<%=cid %>++ + ": ");
        for (int i_<%=cid %> = 0; i_<%=cid %> < adresses_<%=cid %>.length; i_<%=cid %>++ )
        {
          System.out.print(adresses_<%=cid %>[i_<%=cid %>]);
          if (i_<%=cid %> < adresses_<%=cid %>.length-1) System.out.print(",");
        }  
        System.out.println();

    As described in the article Component code generation model, the main part of the Java code is executed once for each row of the incoming data flow. It prints the current line number, followed by the comma-separated list of email addresses, to the console.

  3. Open tTutorialRow_end.javajet in a text editor and define the end of the component javajet code as follows:
    <%@ jet
        imports="
            org.talend.core.model.process.INode
            org.talend.core.model.process.ElementParameterParser
            org.talend.core.model.metadata.IMetadataTable
            org.talend.core.model.metadata.IMetadataColumn
            org.talend.core.model.process.IConnection
            org.talend.core.model.process.IConnectionCategory
            org.talend.designer.codegen.config.CodeGeneratorArgument
            org.talend.core.model.metadata.types.JavaTypesManager
            org.talend.core.model.metadata.types.JavaType
            java.util.List
            java.util.Map      
        "
    %>
    <%
        CodeGeneratorArgument codeGenArgument = (CodeGeneratorArgument) argument;
        INode node = (INode)codeGenArgument.getArgument();
        String cid = node.getUniqueName(); 
    %>  
        globalMap.put("<%=cid %>_NB_LINE",nb_line_<%=cid %>);

    In this example, the nb_line_<%=cid %> variable is put into globalMap so that it can be reused elsewhere in the Job. The key is the unique name of the component followed by _NB_LINE, where NB_LINE is the return variable declared at the end of the XML descriptor file tTutorialRow_java.xml.
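
To see how the three templates fit together, here is a hand-written approximation of the Java they would produce for a component whose unique name is tTutorialRow_1, with two rows in the Addresses table. The component ID, the sample addresses, the class and method names, and the for loop that stands in for the incoming flow are all assumptions made for illustration; the code that Talend Studio actually generates contains considerably more scaffolding.

```java
import java.util.HashMap;
import java.util.Map;

public class GeneratedCodeSketch {

    // Simulates the begin, main, and end parts for a given number of incoming rows.
    static Map<String, Object> run(int incomingRows) {
        Map<String, Object> globalMap = new HashMap<>();

        // --- begin part: executed once, before the first row ---
        java.util.List<String> addresses_tTutorialRow_1 = new java.util.ArrayList<String>();
        addresses_tTutorialRow_1.add("john" + "@" + "example.com");
        addresses_tTutorialRow_1.add("jane" + "@" + "example.org");
        int nb_line_tTutorialRow_1 = 0;

        for (int row = 0; row < incomingRows; row++) { // stand-in for the incoming flow
            // --- main part: executed once per incoming row ---
            String[] adresses_tTutorialRow_1 = addresses_tTutorialRow_1.toArray(new String[] {});
            System.out.print(nb_line_tTutorialRow_1++ + ": ");
            for (int i_tTutorialRow_1 = 0; i_tTutorialRow_1 < adresses_tTutorialRow_1.length; i_tTutorialRow_1++) {
                System.out.print(adresses_tTutorialRow_1[i_tTutorialRow_1]);
                if (i_tTutorialRow_1 < adresses_tTutorialRow_1.length - 1) System.out.print(",");
            }
            System.out.println();
        }

        // --- end part: executed once, after the last row ---
        globalMap.put("tTutorialRow_1_NB_LINE", nb_line_tTutorialRow_1);
        return globalMap;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = run(3);
        // Prints three lines, numbered from 0:
        // 0: john@example.com,jane@example.org
        // 1: john@example.com,jane@example.org
        // 2: john@example.com,jane@example.org
        Integer processed = (Integer) globalMap.get("tTutorialRow_1_NB_LINE");
        System.out.println("Rows processed: " + processed); // prints Rows processed: 3
    }
}
```

Note that the array in the main part is named adresses_ (one d) while the list from the begin part is named addresses_; the two names must stay different because both variables are in scope at the same time.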

Installing and testing the created component

Once you have created and defined all the files for a custom component, you can install it into your Talend Studio so that you can use it in your Jobs.

This procedure outlines the steps to install the tTutorialRow component you have just created. For more information on installing, updating and troubleshooting a custom component, see How to install and update a custom component.

Before you begin

If your newly created component requires external JAR files to function, make sure those files are always available in <studio>/configuration/.m2/repository/org/talend/libraries. For more information on how to import external JAR files into the Studio, read Installing external modules.

Procedure

  1. Create a directory dedicated to custom components on your file system, for example: D:/custom_components.
  2. Copy the component folder tTutorialRow and paste it to the dedicated component directory.

    The path of the component is now D:/custom_components/tTutorialRow/.

  3. Launch your Talend Studio.
  4. Select Window > Preferences from the menu, expand Talend > Components, and browse to the User component folder field.
  5. Type in your component folder path and click OK.
  6. Search for the component named tTutorialRow in the search field of the Palette to check that you have successfully installed the component.
  7. Check that the component works in a Talend Job.

    As an example, create a simple Job that contains a tRowGenerator and a tTutorialRow, and configure the components as follows:

    • Configure the tRowGenerator to generate five lines of random strings.
    • Specify the user names and domain names used to construct email addresses in the Addresses table of the tTutorialRow component.

    When the Job is executed, tTutorialRow is expected to print, for each incoming row, the current line number and the list of email addresses to the Job console.
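
If the component does not appear in the Palette, one thing worth ruling out is a missing or misnamed file in the component folder. The sketch below assumes that the six files from Step 1 are the ones to look for and that each file name starts with the folder name; InstallCheck and missingFiles are hypothetical names, not Talend APIs.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class InstallCheck {

    // Suffixes of the files created in Step 1, including the icon.
    static final String[] SUFFIXES = {
        "_java.xml", "_messages.properties", "_begin.javajet",
        "_main.javajet", "_end.javajet", "_icon32.png"
    };

    // Returns the names of expected files that are absent from the component folder.
    static List<String> missingFiles(Path componentFolder) {
        String name = componentFolder.getFileName().toString();
        List<String> missing = new ArrayList<>();
        for (String suffix : SUFFIXES) {
            if (!Files.isRegularFile(componentFolder.resolve(name + suffix))) {
                missing.add(name + suffix);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Example path taken from step 2 of the procedure; adjust to your own setup.
        Path folder = Paths.get(args.length > 0 ? args[0] : "D:/custom_components/tTutorialRow");
        List<String> missing = missingFiles(folder);
        System.out.println(missing.isEmpty() ? "All expected files found" : "Missing: " + missing);
    }
}
```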
