Validating email addresses and showing duplicates - Cloud - 8.0

Talend Job Script Reference Guide

Last publication date
2024-02-22

Follow the procedure below to add and configure two tJavaRow components: one that validates the incoming email addresses and displays the validation result, and another that displays the duplicate email addresses.

Procedure

  1. Next to the tUniqRow component settings, add a new component, tJavaRow.
    addComponent {
    	setComponentDefinition {
    		TYPE: "tJavaRow",
    		NAME: "tJavaRow_1",
    		POSITION: 640, 96
    	}
    
    }
  2. Next to the setComponentDefinition {} function of this tJavaRow component, define the component properties using the setSettings {} function.

    In this example, this tJavaRow component, labeled validate, executes a piece of Java code that checks whether the character string of each incoming row is a well-formed email address, and then displays the validation result.

    	setSettings {
    		CODE : "String email = input_row.email;
    
    Perl5Matcher matcher = new Perl5Matcher();
    Perl5Compiler compiler = new Perl5Compiler();
    Pattern pattern = compiler.compile(\"^[\\\\w_.-]+@[\\\\w_.-]+\\\\.[\\\\w]+$\");
    
    if (!matcher.matches(email, pattern)) {
    	System.out.println(\"invalid : \" + email);
    } else {
    	System.out.println(\"valid   : \" + email);
    }",
    		LABEL : "validate"
    	}
    
  3. Now add and configure the second tJavaRow, which is used to display the duplicate email addresses.
    addComponent {
    	setComponentDefinition {
    		TYPE: "tJavaRow",
    		NAME: "tJavaRow_2",
    		POSITION: 640, 288
    	}
    	setSettings {
    		CODE : "System.out.println(\"duplicate: \" + input_row.email);",
    		LABEL : "duplicates"
    	}
    }
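
To see which strings the expression in the validate component accepts, the check can be tried outside the Job. The following is a minimal sketch, not the code the component runs: it assumes the standard java.util.regex classes in place of Perl5Matcher and Perl5Compiler, and the class name EmailCheckSketch, the method isValidEmail, the sample rows, and the LinkedHashSet used as a stand-in for tUniqRow are all hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical standalone class; in the Job this logic lives in the tJavaRow CODE setting.
class EmailCheckSketch {

    // Same expression as in the validate component, written as a plain Java
    // string literal (one level of escaping instead of the jobscript's two).
    private static final Pattern EMAIL =
            Pattern.compile("^[\\w_.-]+@[\\w_.-]+\\.[\\w]+$");

    // Returns true only if the whole string matches the expression.
    static boolean isValidEmail(String email) {
        return email != null && EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        // Sample rows are made up for illustration.
        List<String> rows = List.of(
                "alice@example.com", "bob@example", "alice@example.com");

        // Assumption: a set of already seen values approximates splitting rows
        // into a "first occurrence" flow and a "duplicates" flow.
        Set<String> seen = new LinkedHashSet<>();
        for (String email : rows) {
            if (seen.add(email)) {
                // First occurrence: print the same messages as tJavaRow_1.
                String prefix = isValidEmail(email) ? "valid   : " : "invalid : ";
                System.out.println(prefix + email);
            } else {
                // Repeated value: print the same message as tJavaRow_2.
                System.out.println("duplicate: " + email);
            }
        }
    }
}
```

Note that the expression is deliberately simple: it requires at least one dot after the @ sign and rejects characters such as + in the local part, so an address like user+tag@example.com is reported as invalid.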