Viewing and exporting analyzed data - Cloud - 7.3

Talend Studio User Guide

Version: Cloud 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-13
Available in:

  • Big Data Platform
  • Cloud API Services Platform
  • Cloud Big Data Platform
  • Cloud Data Fabric
  • Cloud Data Management Platform
  • Data Fabric
  • Data Management Platform
  • Data Services Platform
  • MDM Platform
  • Real-Time Big Data Platform

Before you begin

  • You have selected the Profiling perspective.
  • A column analysis has been created and executed.

About this task

After running your analysis:
  • With either the SQL or the Java engine, you can right-click any row in the statistic result tables of the Analysis Results view of the analysis editor to access a view of the actual data.
  • With the Java engine, the view of the actual data opens in the Studio.
  • With the SQL engine, the view of the actual data opens in the Data Explorer perspective.

Procedure

  1. At the bottom of the analysis editor, click the Analysis Results tab to open a detailed view of the analysis results.
  2. Right-click a data row in the statistic results of any of the analyzed columns and select one of the following options:

    Option: Operation

    View rows: Open a view listing all data rows in the analyzed column.

      Note: For the Duplicate Count indicator, the View rows option lists every row whose value is duplicated, not one row per duplicated value. For example, if the duplicate count is 12 and each duplicated value occurs exactly twice, this option lists 24 rows.

    View values: Open a view listing the actual data values of the analyzed column.

    Identify duplicates: Generate a ready-to-use Job that identifies and separates unique and duplicate records in the selected column for subsequent processing. By default, this Job outputs all the duplicates to a reject CSV file and writes the unique values to a separate file. For further information, see Generating a Job to Identify duplicate values in an analyzed column.

    Additional options are available when you use regular expressions and SQL patterns in a column analysis.

    When using the SQL engine, the view opens in the Data Explorer perspective listing the rows or the values of the analyzed data according to the limits set in the data explorer.

    If the Data Explorer perspective is missing from the Studio, you must install certain SQL explorer libraries that are required for data quality to work correctly. Otherwise, you may get an error message.

    For further information about identifying and installing external modules, see the Talend Installation and Upgrade Guide.

    Warning: The data explorer does not support connections that have an empty user name, such as a single sign-on connection to MS SQL Server. If you analyze data using such a connection and try to view data rows and values in the Data Explorer perspective, a warning message prompts you to set your connection credentials for the SQL Server.

    When using the Java engine, the view opens in the Studio and lists the number of analyzed data rows that you set in the Analysis parameters view of the analysis editor.

  3. From this view, you can export the analyzed data into a CSV file:
    1. Click the icon in the upper left corner of the view.
      A dialog box opens.
    2. Click Choose..., browse to where you want to store the CSV file, and give the file a name.
    3. Click OK to close the dialog box.
      A CSV file is created in the specified location. It holds all the analyzed data rows listed in the view.
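
Once the rows are exported, the relationship between a duplicate count and the number of duplicated rows can be checked outside the Studio. The following is a minimal Python sketch, not part of Talend Studio. It assumes that the exported file is comma-delimited with a header row, which may not match your export settings. The function name `duplicate_summary`, the column name `email`, and the sample data are hypothetical and stand in for the content of an exported file.

```python
import csv
import io
from collections import Counter


def duplicate_summary(csv_text, column):
    """Return (duplicate_count, duplicated_rows) for one column of CSV text.

    duplicate_count: number of distinct values that occur more than once.
    duplicated_rows: total number of rows holding one of those values.
    Assumes comma-delimited text with a header row (an assumption about
    the export, not something the Studio guarantees).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column] for row in reader)
    duplicated = {value: n for value, n in counts.items() if n > 1}
    return len(duplicated), sum(duplicated.values())


# Hypothetical sample standing in for the content of an exported CSV file.
sample = "email\na@x.com\nb@x.com\na@x.com\nc@x.com\nb@x.com\n"

dup_count, dup_rows = duplicate_summary(sample, "email")
print(dup_count, dup_rows)  # prints: 2 4
```

In this sample, two values are duplicated and each occurs twice, so four rows are involved. To run the sketch against a real export, read the file content with `open(path, newline="").read()` and pass it in place of `sample`.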