Viewing and exporting analyzed data - Cloud - 8.0

Talend Studio User Guide

Version: Cloud 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-13
Available in...

  • Big Data Platform
  • Cloud API Services Platform
  • Cloud Big Data Platform
  • Cloud Data Fabric
  • Cloud Data Management Platform
  • Data Fabric
  • Data Management Platform
  • Data Services Platform
  • MDM Platform
  • Real-Time Big Data Platform

Before you begin

  • You have selected the Profiling perspective.
  • A column analysis has been created and executed.

About this task

After running your analysis, you can access a view of the actual data behind the results:
  • With either the SQL or the Java engine, right-click any row in the statistic result tables in the Analysis Results view of the analysis editor.
  • With the Java engine, the view of the actual data opens in Talend Studio.
  • With the SQL engine, the view of the actual data opens in the Data Explorer perspective.
Contextual menu of analyzed data in the Simple Statistics section.

Procedure

  1. At the bottom of the analysis editor, click the Analysis Results tab to open a detailed view of the analysis results.
  2. Right-click a data row in the statistic results of any of the analyzed columns and select one of the following options:

    Option                 Operation

    View rows              Open a view listing all data rows in the analyzed column.
                           Note: For the Duplicate Count indicator, the View rows option lists every row that holds a duplicated value. For example, if the duplicate count is 12, this option lists at least 24 rows (exactly 24 when each duplicated value occurs twice).

    View values            Open a view listing the actual data values of the analyzed column.

    Identify duplicates    Generate a ready-to-use Job that identifies and separates unique and duplicate records in the selected column for subsequent processing. By default, this Job outputs all the duplicates to a reject CSV file and writes the unique values to a separate file. For further information, see Generating a Job to Identify duplicate values in an analyzed column.

    Other options are available when you use regular expressions and SQL patterns in a column analysis.

    When using the SQL engine, the view opens in the Data Explorer perspective listing the rows or the values of the analyzed data according to the limits set in the data explorer.

    If the Data Explorer perspective is missing from Talend Studio, you must install certain SQL explorer libraries that data quality requires to work correctly. Otherwise, you may get an error message.

    For further information about identifying and installing external modules, see Installing external modules to Talend Studio.

    Example of a query and the rows returned against this query.
    Warning: The data explorer does not support connections that have an empty user name, such as those that use single sign-on for MS SQL Server. If you analyze data using such a connection and try to view data rows and values in the Data Explorer perspective, a warning message prompts you to set your connection credentials for the SQL Server.

    When using the Java engine, the view opens in Talend Studio and lists as many analyzed data rows as you set in the Analysis parameters view of the analysis editor.

    Overview of the View rows tab.
  3. From this view, you can export the analyzed data into a CSV file:
    1. Click the Export to .csv icon in the upper left corner of the view.
      A dialog box opens.
      Overview of the CSV export options dialog box.
    2. Click Choose..., browse to where you want to store the CSV file, and give it a name.
    3. Click OK to close the dialog box.
      A CSV file containing all the analyzed data rows listed in the view is created in the specified location.
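
Once exported, the CSV file can be checked outside Talend Studio. The following Python sketch is not part of the product. It illustrates why, for the Duplicate Count indicator, View rows can list more rows than the indicator value: the indicator counts distinct values that occur more than once, while the view lists every row holding one of those values. The function name duplicate_summary, the email column, and the sample data are hypothetical, and the sketch assumes a comma-delimited file with a header row, which may differ from the options you choose in the export dialog box.

```python
import csv
import io
from collections import Counter


def duplicate_summary(csv_file, column):
    """Return (duplicate_count, duplicated_row_count) for one column.

    duplicate_count: number of distinct values occurring more than once.
    duplicated_row_count: total number of rows holding one of those values.
    """
    # Assumption: the file has a header row and uses a comma delimiter.
    reader = csv.DictReader(csv_file)
    counts = Counter(row[column] for row in reader)
    repeated = {value: n for value, n in counts.items() if n > 1}
    return len(repeated), sum(repeated.values())


# Hypothetical sample standing in for an exported file. With a real export,
# pass open(path, newline="", encoding="utf-8") instead of this buffer.
sample = io.StringIO(
    "id,email\n"
    "1,a@example.com\n"
    "2,b@example.com\n"
    "3,a@example.com\n"
    "4,c@example.com\n"
    "5,b@example.com\n"
    "6,a@example.com\n"
)

duplicate_count, duplicated_rows = duplicate_summary(sample, "email")
print(duplicate_count, duplicated_rows)
```

In this sample, two values are duplicated but five rows hold them, because one of the values occurs three times.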