TalendString Routines - 6.1

Talend Real-time Big Data Platform Studio User Guide


The TalendString routines allow you to carry out various operations on alphanumerical expressions.

To access these routines, double-click TalendString under the system folder. The TalendString class contains the following routines:





Routine: replaceSpecialCharForXML
Description: Returns a string in which the special characters (e.g. <, >, &) have been replaced by their XML equivalents.
Syntax: TalendString.replaceSpecialCharForXML("string containing the special characters, e.g. Thelma & Louise")

Routine: checkCDATAForXML
Description: Treats a string that starts with <![CDATA[ and ends with ]]> as XML and returns it without modification. Any other string is transformed into an XML-compatible form and returned.
Syntax: TalendString.checkCDATAForXML("string to be parsed")

Routine: talendTrim
Description: Parses the input string and removes the filler characters according to the alignment value specified: -1 removes the filler characters at the end of the string, 1 removes those at the start of the string, and 0 removes both. Returns the trimmed string.
Syntax: TalendString.talendTrim("string to be parsed", 'filler character to be removed', alignment value)

Routine: removeAccents
Description: Removes accents from a string and returns the string without the accents.
Syntax: TalendString.removeAccents("string to be parsed")

Routine: getAsciiRandomString
Description: Generates a random string with a specific number of characters.
Syntax: TalendString.getAsciiRandomString(whole number indicating the length of the string)
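To make the idea of a fixed-length random string concrete, here is a minimal standalone sketch in plain Java. It is not Talend's implementation: the class name RandomAsciiSketch, the method randomAscii, and the choice of alphabet (ASCII letters and digits) are all assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical stand-in for illustration only; not the TalendString source.
final class RandomAsciiSketch {

    // Assumed alphabet: ASCII letters and digits.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // Builds a string of exactly `length` characters drawn from ALPHABET.
    static String randomAscii(int length, Random random) {
        if (length < 0) {
            throw new IllegalArgumentException("length must not be negative");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints a different 6-character string on each run.
        System.out.println(randomAscii(6, new Random()));
    }
}
```

Passing the Random instance in as a parameter lets a caller supply a seeded generator when reproducible output is needed.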

How to format an XML string

You can call the replaceSpecialCharForXML routine from a tJava component to format a string for XML:

System.out.println(TalendString.replaceSpecialCharForXML("Thelma & Louise"));

In this example, the "&" character is replaced with its XML equivalent so that the string is XML compatible.
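The kind of replacement involved can be sketched in plain Java, outside Talend Studio. The class XmlEscapeSketch and its methods are hypothetical stand-ins, not the TalendString source. The sketch assumes the five predefined XML entities are the ones replaced, and that a string wrapped in <![CDATA[ ... ]]> is passed through untouched, as described for checkCDATAForXML.

```java
// Hypothetical stand-in for illustration only; not the TalendString source.
final class XmlEscapeSketch {

    // Replaces the five characters that have predefined XML entities.
    static String escapeForXml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Returns CDATA-wrapped input unchanged and escapes everything else.
    static String checkCdata(String input) {
        if (input.startsWith("<![CDATA[") && input.endsWith("]]>")) {
            return input;
        }
        return escapeForXml(input);
    }

    public static void main(String[] args) {
        System.out.println(escapeForXml("Thelma & Louise"));     // prints: Thelma &amp; Louise
        System.out.println(checkCdata("<![CDATA[Thelma & Louise]]>")); // printed unchanged
    }
}
```

Scanning character by character in a single pass avoids escaping the "&" of an entity that an earlier replacement has just inserted.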

How to trim a string

You can call the talendTrim routine from a tJava component to remove padding characters from the start of a string, the end of a string, or both:

System.out.println(TalendString.talendTrim("**talend open studio****", '*', -1));
System.out.println(TalendString.talendTrim("**talend open studio****", '*', 1));
System.out.println(TalendString.talendTrim("**talend open studio****", '*', 0));

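Following the alignment values described for talendTrim (-1 for the end, 1 for the start, 0 for both), the star characters are removed from the end of the string, then from the start, and finally from both ends.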
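The effect of the three alignment values can be reproduced with a small standalone sketch. It follows the semantics given in the routine description (-1 trims the end, 1 trims the start, 0 trims both). The class TrimSketch and the method trimPadding are hypothetical names, not the TalendString source. Treating any negative or positive value like -1 or 1 is an assumption of the sketch.

```java
// Hypothetical stand-in for illustration only; not the TalendString source.
final class TrimSketch {

    // align < 0: strip padding at the end; align > 0: strip padding at the
    // start; align == 0: strip both ends.
    static String trimPadding(String origin, char padding, int align) {
        int start = 0;
        int end = origin.length();
        if (align >= 0) {
            while (start < end && origin.charAt(start) == padding) {
                start++;
            }
        }
        if (align <= 0) {
            while (end > start && origin.charAt(end - 1) == padding) {
                end--;
            }
        }
        return origin.substring(start, end);
    }

    public static void main(String[] args) {
        String s = "**talend open studio****";
        System.out.println(trimPadding(s, '*', -1)); // prints: **talend open studio
        System.out.println(trimPadding(s, '*', 1));  // prints: talend open studio****
        System.out.println(trimPadding(s, '*', 0));  // prints: talend open studio
    }
}
```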

How to remove accents from a string

You can call the removeAccents routine from a tJava component, passing it the string to process. The accented characters are replaced with non-accented characters.
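One common way to strip accents in plain Java is Unicode decomposition with java.text.Normalizer. The sketch below uses that technique for illustration; it is not confirmed to be how TalendString.removeAccents works internally. The class AccentSketch and the method stripAccents are hypothetical names.

```java
import java.text.Normalizer;

// Hypothetical stand-in for illustration only; not the TalendString source.
final class AccentSketch {

    // Decomposes each accented character into a base letter plus combining
    // marks (NFD), then removes the combining marks.
    static String stripAccents(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // \u00e9 is e with acute accent, \u00e8 is e with grave accent.
        System.out.println(stripAccents("caf\u00e9 cr\u00e8me")); // prints: cafe creme
    }
}
```

Characters that have no canonical decomposition, such as "ø" or "ß", pass through this sketch unchanged.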