Creating a schema analysis - 6.1

Talend Data Fabric Studio User Guide

You can use the Profiling perspective of the studio to analyze a specific schema in a database, provided that the database uses the schema entity in its physical structure. The analysis returns summary information about the content of the schema, for example the number of tables, the total number of rows, and the number of rows per table.
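
To make these figures concrete, the following minimal sketch computes the same kind of overview (number of tables, number of views, rows per table) by hand. It is not how the studio computes its results. It assumes an in-memory SQLite database as a stand-in for a real schema, and the function name `schema_overview` and the sample tables are hypothetical.

```python
import sqlite3


def schema_overview(conn):
    """Return (table_count, view_count, {table_name: row_count})."""
    cur = conn.cursor()
    # sqlite_master plays the role of the database catalog in this stand-in.
    cur.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    )
    objects = cur.fetchall()
    tables = [name for name, kind in objects if kind == "table"]
    views = [name for name, kind in objects if kind == "view"]

    rows_per_table = {}
    for name in tables:
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + name.replace('"', '""') + '"'
        cur.execute(f"SELECT COUNT(*) FROM {quoted}")
        rows_per_table[name] = cur.fetchone()[0]
    return len(tables), len(views), rows_per_table


# Hypothetical sample data standing in for an analyzed schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE VIEW customer_names AS SELECT name FROM customer;
    INSERT INTO customer (name) VALUES ('Ann'), ('Bob'), ('Cy');
    INSERT INTO orders (customer_id) VALUES (1), (1);
    """
)

table_count, view_count, rows_per_table = schema_overview(conn)
total_rows = sum(rows_per_table.values())
print(table_count, view_count, total_rows, rows_per_table)
```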

Prerequisite(s): You have created at least one connection to a database that uses the "schema" entity, for example DB2. For further information, see Connecting to a database.

Defining the analysis

  1. In the DQ Repository tree view, expand Data Profiling.

  2. Right-click the Analyses folder and select New Analysis.

    The [Create New Analysis] wizard opens.

  3. In the filter field, start typing schema, select Schema Overview Analysis and click Next.

  4. In the Name field, enter a name for the current analysis.

    Note

    Avoid using special characters in the item names including:

    "~", "!", "`", "#", "^", "&", "*", "\\", "/", "?", ":", ";", "\"", ".", "(", ")", "’", "¥", "‘", "”", "«", "»", "<", ">".

    In the file system, each of these characters is replaced with "_", so two different item names can map to the same file name and you may end up with duplicate items.

  5. If required, set the analysis metadata (purpose, description and author name) in the corresponding fields and click Next to proceed to the next step.
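
The note in step 4 can be illustrated with a short sketch of how two distinct analysis names collapse into the same file name once special characters are replaced with "_". This is an illustration only, not the studio's actual code: the helper names `file_system_name` and `find_collisions` are hypothetical, and the character set below covers only the characters from the note that are unambiguous in plain text.

```python
# Subset of the special characters listed in the note (assumption: the
# typographic quote characters are left out of this illustration).
SPECIAL_CHARS = frozenset([
    "~", "!", "`", "#", "^", "&", "*", "\\", "/", "?", ":", ";", '"',
    ".", "(", ")", "'", "¥", "«", "»", "<", ">",
])


def file_system_name(item_name):
    """Replace every special character with an underscore."""
    return "".join("_" if ch in SPECIAL_CHARS else ch for ch in item_name)


def find_collisions(item_names):
    """Group item names that end up with the same file-system name."""
    by_file_name = {}
    for name in item_names:
        by_file_name.setdefault(file_system_name(name), []).append(name)
    return {fs: names for fs, names in by_file_name.items() if len(names) > 1}


collisions = find_collisions(["sales:2015", "sales/2015", "sales_2016"])
print(collisions)
```

Here "sales:2015" and "sales/2015" both become "sales_2015", which is the duplicate-item situation the note warns about.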

Selecting the schema you want to analyze

  1. Expand DB Connections, then expand the database whose physical structure includes schema entities, and select the schema to analyze.

  2. Click Next.

  3. If needed, restrict the analysis to specific tables and/or views by entering filters, written in the SQL language, in the corresponding fields.

    By default, the analysis includes all tables and views in the schema.

  4. Click Finish to close the [Create New Analysis] wizard.

    A folder for the newly created analysis is listed under Analyses in the DQ Repository tree view, and the analysis editor opens with the defined metadata.

    Note

    The display of the analysis editor depends on the parameters you set in the [Preferences] window. For more information, see Setting preferences of analysis editors and analysis results.

  5. Click Analysis Parameters and:

    • In the Number of connections per analysis field, set the number of concurrent connections allowed per analysis to the selected database connection.

      Set this number according to the resources available to the database, that is, the number of concurrent connections the database can support.

    • Check or modify the filters on tables and/or views, if any.

  6. In the Context Group Settings view, select from the list the context environment you want to use to run the analysis.

    The table in this view lists all the context environments, with their values, that you have defined in the Contexts view of the analysis editor. For further information, see Using context variables in analyses.

  7. Click the save icon on top of the editor and then press F6 to execute the current analysis.

    A message opens to confirm that the operation is in progress.

    Analysis results are stored in the Statistical informations area.

  8. Click Statistical informations to show analytical information about the content of the relevant schema.
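
As an illustration of the table and view filters set in step 3, the sketch below shows how a name filter narrows the list of analyzed objects. It assumes that a filter behaves like a SQL LIKE pattern ("%" for any sequence of characters, "_" for a single character) and that matching ignores case; the exact filter syntax accepted by the studio is not confirmed here. The function names and table names are hypothetical.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any sequence of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def filter_names(names, pattern=None):
    """Keep the names matching the pattern; no pattern keeps everything."""
    if not pattern:
        return list(names)
    regex = like_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


tables = ["CUSTOMER", "CUSTOMER_ADDRESS", "ORDERS", "TMP_LOAD"]
selected = filter_names(tables, "CUST%")
everything = filter_names(tables)
print(selected)
print(everything)
```

With no filter, every table is kept, which mirrors the default behavior described in step 3.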

From the Statistical informations view, you can:

  • Click the schema in the analytical table to open a result list that details all tables included in the selected schema with a summary of their content.

    The selected schema is highlighted in blue. A schema highlighted in red indicates potential problems in its data.

  • Right-click a table or a view and select Table analysis to create a table analysis on the selected item.

  • Click any column header in the analytical table to sort the listed data alphabetically.
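
The effect of the Number of connections per analysis parameter, set under Analysis Parameters, can be pictured with the following sketch: per-table work is spread over a pool whose size caps how many connections are open at once. This is a simulation under stated assumptions, not the studio's implementation. No real database is contacted, and `ConnectionTracker`, `count_rows`, `analyze` and the table sizes are all hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConnectionTracker:
    """Counts simulated open connections and remembers the peak."""

    def __init__(self):
        self._lock = threading.Lock()
        self._open = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self._open += 1
            self.peak = max(self.peak, self._open)
        return self

    def __exit__(self, *exc_info):
        with self._lock:
            self._open -= 1
        return False


def count_rows(table, sizes, tracker):
    """Simulate one query that holds a connection while counting rows."""
    with tracker:
        time.sleep(0.01)  # stands in for the round trip to the database
        return table, sizes[table]


def analyze(sizes, connections_per_analysis):
    """Count rows for every table with a bounded number of workers."""
    tracker = ConnectionTracker()
    with ThreadPoolExecutor(max_workers=connections_per_analysis) as pool:
        pairs = pool.map(lambda table: count_rows(table, sizes, tracker), sizes)
        results = dict(pairs)
    return results, tracker.peak


# Hypothetical row counts for the tables of one schema.
sizes = {"CUSTOMER": 1200, "ORDERS": 5400, "PRODUCT": 300, "INVOICE": 4100}
results, peak = analyze(sizes, connections_per_analysis=2)
print(results, peak)
```

Raising the limit can shorten the run when the database has spare capacity, while a limit above what the database supports only shifts the bottleneck to the server.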