Validating email addresses and showing duplicates - 6.5

Talend Job Script Reference Guide

author: Talend Documentation Team
EnrichVersion: 6.5
EnrichProdName: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
task: Design and Development > Designing Jobs
EnrichPlatform: Talend CommandLine, Talend Studio

Follow the procedure below to add and configure two tJavaRow components: one validates the deduplicated email addresses and displays the validation result, and the other displays the duplicate email addresses.

Procedure

  1. After the tUniqRow component settings, add a new tJavaRow component.
    addComponent {
    	setComponentDefinition {
    		TYPE: "tJavaRow",
    		NAME: "tJavaRow_1",
    		POSITION: 640, 96
    	}
    
    }
  2. After the setComponentDefinition {} function of this tJavaRow component, inside the same addComponent {} function, define the component properties using the setSettings {} function.

    In this example, this tJavaRow component, labelled validate, executes a piece of Java code that checks whether the character string of each incoming row is a well-formed email address, and then displays the validation result.

    	setSettings {
    		CODE : "String email = input_row.email;
    
    Perl5Matcher matcher = new Perl5Matcher();
    Perl5Compiler compiler = new Perl5Compiler();
    Pattern pattern = compiler.compile(\"^[\\\\w_.-]+@[\\\\w_.-]+\\\\.[\\\\w]+$\");
    
    if (!matcher.matches(email, pattern)) {
    	System.out.println(\"invalid : \" + email);
    } else {
    	System.out.println(\"valid   : \" + email);
    }",
    		LABEL : "validate"
    	}
    
  3. Now add and configure the second tJavaRow, which is used to display the duplicate email addresses.
    addComponent {
    	setComponentDefinition {
    		TYPE: "tJavaRow",
    		NAME: "tJavaRow_2",
    		POSITION: 640, 288
    	}
    	setSettings {
    		CODE : "System.out.println(\"duplicate: \" + input_row.email);",
    		LABEL : "duplicates"
    	}
    }
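
The validation logic configured in the first tJavaRow can be tried outside a Job with a small standalone program. The following is a minimal sketch, not the code that Talend generates: it swaps the Perl5Matcher and Perl5Compiler classes for the standard java.util.regex package and uses the same expression with the Job Script escaping removed. The class name EmailCheckSketch, the helper methods isValid and describe, the sample addresses, and the treatment of a null value as invalid are all assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Talend; it only mirrors the
// check performed in the CODE setting of the "validate" tJavaRow.
class EmailCheckSketch {

    // Same expression as in the Job Script, written as a plain Java string.
    private static final Pattern EMAIL =
            Pattern.compile("^[\\w_.-]+@[\\w_.-]+\\.[\\w]+$");

    // Returns true when the whole string matches the expression.
    // Assumption: a null value is treated as invalid.
    static boolean isValid(String email) {
        return email != null && EMAIL.matcher(email).matches();
    }

    // Builds the same kind of message that the tJavaRow prints.
    static String describe(String email) {
        return (isValid(email) ? "valid   : " : "invalid : ") + email;
    }

    public static void main(String[] args) {
        // Illustrative sample values only.
        String[] samples = {
            "jane.doe@example.com",
            "no-at-sign.example.com",
            "trailing@dot."
        };
        for (String sample : samples) {
            System.out.println(describe(sample));
        }
    }
}
```

Running such a check against a few known good and bad addresses is a quick way to confirm that the expression behaves as intended before it is pasted, with the extra escaping, into the CODE setting.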