Available in...
Big Data Platform
Cloud API Services Platform
Cloud Big Data Platform
Cloud Data Fabric
Cloud Data Management Platform
Data Fabric
Data Management Platform
Data Services Platform
MDM Platform
Real-Time Big Data Platform
You can generate a ready-to-use Job on the results of a column analysis. This Job retrieves the valid rows, the invalid rows, or both, and writes them to output files or databases.
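Conceptually, the generated Job matches each value of the analyzed column against the pattern's regular expression and routes matching and non-matching rows to separate outputs. A minimal Java sketch of that routing, using a hypothetical simplified email pattern (not the pattern shipped with the product):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class PatternRouter {
    // Hypothetical, simplified email pattern for illustration only.
    static final Pattern EMAIL = Pattern.compile("^[\\w.+-]+@[\\w-]+\\.[\\w.]+$");

    // Route each row into a "valid" or "invalid" bucket, as the generated
    // Job does with its two output files.
    static List<List<String>> route(List<String> rows) {
        List<String> valid = new ArrayList<>();
        List<String> invalid = new ArrayList<>();
        for (String row : rows) {
            (EMAIL.matcher(row).matches() ? valid : invalid).add(row);
        }
        return List.of(valid, invalid);
    }
}
```

The real Job adds input/output components around this matching step, but the valid/invalid split follows the same logic.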
Before you begin
- Follow the steps outlined in Defining the columns to be analyzed and Adding a regular expression or an SQL pattern to a column analysis to create a column analysis that uses a pattern.
- Execute the column analysis.
Procedure
- In the Analysis Results view, click Pattern Matching under the name of the analyzed column.
The generated graphic for the pattern matching is displayed, accompanied by a table that details the matching results.
- Right-click the pattern line in the Pattern Matching table and select Generate Jobs.
The Job Selector dialog box is displayed.
When you analyze the column using a pattern defined for a specific database, you can generate ELT Jobs. When you analyze the column using a pattern defined for the Java or the Default language, you can generate an ETL Job.
- In the dialog box, select:
Option: generate an ELT job to get only valid rows
To: generate a Job that uses the Extract Load Transform process to write the valid rows of the analyzed column in an output file. This option is not available for the Amazon Redshift database.

Option: generate an ELT job to get only invalid rows
To: generate a Job that uses the Extract Load Transform process to write the invalid rows of the analyzed column in an output file. This option is not available for the Amazon Redshift database.

Option: generate an ETL job to handle rows
To: generate a Job that uses the Extract Transform Load process to write the valid/invalid rows of the analyzed column in output files.

In this example, we select the generate an ETL job to handle rows option to generate a Job that writes the valid and invalid email rows to two separate output files.
- In the dialog box, click Finish to proceed to the next step.
The Integration perspective opens on the generated Job.
- Optional: Use different output components to retrieve the valid/invalid rows in different types of files or in databases.
- Save your Job and press F6 to execute it.
The valid and invalid email rows of the analyzed column are written to the defined output files. The results in the retrieved files may depend on the ETL or ELT mode: in ETL mode, the data is matched against Java regular expressions, while in ELT mode, the data is matched against the appropriate database regular expressions. Because the regular expression engines in Java and in the DBMS work differently, the results may differ, especially if you defined different regular expressions in the pattern editor.
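One common source of such differences, shown here as an illustration rather than an exhaustive rule, is anchoring: Java's String.matches() implicitly requires the pattern to match the whole value, while some database engines (for example, Oracle's REGEXP_LIKE) accept a substring match unless the pattern is explicitly anchored with ^ and $. The sketch below demonstrates the Java side and approximates the substring behavior with find():

```java
import java.util.regex.Pattern;

public class EngineDifference {
    public static void main(String[] args) {
        String value = "user@example.com extra text";
        String pattern = "[\\w.]+@[\\w.]+";  // hypothetical unanchored pattern

        // Java's matches() anchors the pattern to the whole string,
        // so the trailing text makes this row invalid in an ETL Job.
        boolean javaFullMatch = value.matches(pattern);

        // A database engine that matches substrings would still find the
        // pattern inside the value; find() approximates that behavior.
        boolean substringMatch = Pattern.compile(pattern).matcher(value).find();

        System.out.println(javaFullMatch);   // false
        System.out.println(substringMatch);  // true
    }
}
```

When the same row can be classified differently by the two engines, anchoring the pattern explicitly in the pattern editor helps keep ETL and ELT results consistent.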