After running your analysis using the SQL or the Java engine, you can open a view of the actual data from the Analysis Results view of the analysis editor by right-clicking any row in the statistic result tables. With the Java engine, the view opens in the studio. With the SQL engine, it opens in the Data Explorer perspective.
Prerequisite(s): You have selected the Profiling perspective in the studio. A column analysis has been created and executed.
To view and export the analyzed data, do the following:
At the bottom of the analysis editor, click the Analysis Results tab to open a detailed view of the analysis results.
Right-click a data row in the statistic results of the analyzed columns and select one of the following options:
View rows: opens a view listing all data rows in the analyzed column.
For the Duplicate Count indicator, the View rows option lists all the rows that are duplicated. For example, if the duplicate count is 12 and each duplicated value occurs exactly twice, this option lists 24 rows.
View values: opens a view listing the actual data values of the analyzed column.
Identify duplicates: generates a ready-to-use Job that identifies and separates unique and duplicate records in the selected column for subsequent processing. By default, this Job outputs all the duplicates in a reject .csv file and writes the unique values to a separate file. For further information, see Generating a Job to Identify duplicate values in an analyzed column.
Other options are available when you use regular expressions and SQL patterns in a column analysis. For further information, see Using regular expressions and SQL patterns in a column analysis and How to view the data analyzed against patterns.
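The relationship between the Duplicate Count indicator and the number of rows listed by View rows can be illustrated with a minimal Python sketch. It assumes that the duplicate count is the number of distinct values occurring more than once; the function name and sample data are hypothetical, and this is not the studio's own implementation.

```python
from collections import Counter


def duplicate_summary(values):
    """Return (duplicate_count, duplicated_rows) for one column of values.

    Assumption: the duplicate count is the number of distinct values
    that occur more than once in the column.
    """
    counts = Counter(values)
    # Number of distinct values that appear at least twice.
    duplicate_count = sum(1 for n in counts.values() if n > 1)
    # Every row whose value is duplicated, in original order.
    duplicated_rows = [v for v in values if counts[v] > 1]
    return duplicate_count, duplicated_rows


# Hypothetical column content: "ann" and "bob" each occur twice.
column = ["ann", "bob", "ann", "cy", "bob", "dee"]
dup_count, dup_rows = duplicate_summary(column)
print(dup_count)      # 2
print(len(dup_rows))  # 4
```

Here two duplicated values produce four listed rows, in the same way that a duplicate count of 12 produces 24 rows when each duplicated value occurs twice. If a value occurs three or more times, the number of listed rows is higher.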
When using the SQL engine, the view opens in the Data Explorer perspective listing the rows or the values of the analyzed data according to the limits set in the data explorer.
If the Data Explorer perspective is missing from the studio, you must install certain SQL explorer libraries that are required for data quality to work correctly. Otherwise, you may get an error message.
For further information about identifying and installing external modules, see the Talend Installation Guide.
This explorer view also gives some basic information about the analysis itself. Such information is of great help when working with multiple analyses at the same time.
The data explorer does not support connections that have an empty user name, such as single sign-on for MS SQL Server. If you analyze data using such a connection and try to view data rows and values in the Data Explorer perspective, a warning message prompts you to set your connection credentials to the SQL Server.
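To give an idea of what a row-limited listing of duplicated rows looks like on the database side, here is a minimal sketch using Python's built-in sqlite3 module. The query, the table and column names, and the limit value are all assumptions made for illustration. The source does not show the query that the studio actually generates, and it may differ.

```python
import sqlite3


def view_duplicate_rows(conn, table, column, limit):
    """List rows whose `column` value occurs more than once, up to `limit` rows.

    Table and column names are interpolated into the SQL text, so only
    pass trusted identifiers. The limit is bound as a query parameter.
    """
    query = (
        f'SELECT * FROM "{table}" WHERE "{column}" IN ('
        f'SELECT "{column}" FROM "{table}" '
        f'GROUP BY "{column}" HAVING COUNT(*) > 1'
        f') ORDER BY "{column}", id LIMIT ?'
    )
    return conn.execute(query, (limit,)).fetchall()


# Hypothetical in-memory table standing in for the analyzed data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@x.org"), (2, "b@x.org"), (3, "a@x.org"),
     (4, "c@x.org"), (5, "b@x.org")],
)

rows = view_duplicate_rows(conn, "customers", "email", limit=3)
print(rows)  # [(1, 'a@x.org'), (3, 'a@x.org'), (2, 'b@x.org')]
```

Four rows carry a duplicated email, but only three are returned because of the limit, which mirrors how the explorer view is bounded by the limits set in the data explorer.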
When using the Java engine, the view opens in the studio and lists as many analyzed data rows as you set in the Analysis parameters view of the analysis editor. For more information, see Using the Java or the SQL engine.
From this view, you can export the analyzed data into a csv file. To do that:
Click the icon in the upper left corner of the view.
A dialog box opens.
Click the Choose... button, browse to where you want to store the csv file, and give it a name.
Click OK to close the dialog box.
A csv file holding all the analyzed data rows listed in the view is created in the specified location.
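Once exported, the file can be processed outside the studio. The following minimal Python sketch reads such a file with the standard csv module. The comma delimiter, the absence of a header row, and the sample content are assumptions, since the source does not describe the exact layout of the exported file. An in-memory string stands in for the real export so that the example is self-contained.

```python
import csv
import io


def read_export(handle, delimiter=","):
    """Read exported rows from an open text file, skipping blank lines.

    Assumption: one record per line, fields separated by `delimiter`.
    """
    return [row for row in csv.reader(handle, delimiter=delimiter) if row]


# Hypothetical file content standing in for an exported csv file.
sample = io.StringIO(
    "1,ann@example.com\n"
    "2,bob@example.com\n"
    "\n"
    "3,ann@example.com\n"
)
rows = read_export(sample)
print(len(rows))  # 3
```

For a real export, open the file with `open(path, newline="")` and pass the resulting handle to `read_export`, adjusting the delimiter if the file uses a different one.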