Creating a database content analysis - 6.5

Talend Open Studio for MDM User Guide

From the Profiling perspective of the studio, you can create an analysis to examine the content of a given database.

Prerequisite(s): At least one database connection is set in the Profiling perspective of the studio. For further information, see Connecting to a database.

To create a database content analysis, you must first define the relevant analysis and then select the database connection you want to analyze.

Defining the analysis

  1. In the DQ Repository tree view, expand Data Profiling.

  2. Right-click the Analyses folder and select New Analysis.

    The [Create New Analysis] wizard opens.

  3. In the filter field, start typing connection overview analysis, select Connection Overview Analysis from the list that is displayed and click Next.

    As a shortcut, you can also right-click the database under Metadata > DB connections and select Overview analysis from the contextual menu.

  4. In the Name field, enter a name for the current analysis.

    Note

    Avoid using special characters in item names, including:

    "~", "!", "`", "#", "^", "&", "*", "\\", "/", "?", ":", ";", "\"", ".", "(", ")", "'", "¥", "‘", "”", "«", "»", "<", ">".

    These characters are all replaced with "_" in the file system, so you may end up creating duplicate items.

  5. Set the analysis metadata (purpose, description and author name) in the corresponding fields and click Next.
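
The note in step 4 explains why special characters can lead to duplicate items. The following Python sketch illustrates the effect. It is not the studio's actual routine, which this guide does not show: the function name `to_file_name` and the constant `SPECIAL_CHARS` are hypothetical, and the character set is an assumed subset of the list in the note.

```python
# Assumed subset of the special characters listed in the note.
# The studio's real replacement logic is not documented here.
SPECIAL_CHARS = "~!`#^&*\\/?:;\".()'¥«»<>"

# Translation table mapping each special character to an underscore.
_TABLE = str.maketrans({char: "_" for char in SPECIAL_CHARS})


def to_file_name(item_name: str) -> str:
    """Return the item name with every listed special character replaced by '_'."""
    return item_name.translate(_TABLE)


# Three different item names that collapse to the same name on disk.
names = ["sales:2017", "sales/2017", "sales_2017"]
file_names = [to_file_name(name) for name in names]
print(file_names)  # ['sales_2017', 'sales_2017', 'sales_2017']
```

Because all three names map to the same file name, sticking to letters, digits and underscores in the Name field avoids this kind of collision.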

Selecting the database connection you want to analyze

  1. Expand DB Connections and, if more than one connection exists, select the one you want to analyze.

  2. Click Next.

  3. If needed, use SQL to set filters on the tables and/or views you want to analyze in the corresponding fields.

    By default, the analysis examines all tables and views in the database.

  4. Click Finish to close the [Create New Analysis] wizard.

    A folder for the newly created analysis is listed under the Analyses folder in the DQ Repository tree view, and the connection editor opens with the defined metadata.

    Note

    The display of the connection editor depends on the parameters you set in the [Preferences] window. For more information, see Setting preferences of analysis editors and analysis results.

  5. Click Analysis Parameters and:

    • In the Number of connections per analysis field, set the number of concurrent connections allowed per analysis to the selected database connection.

      Set this number according to the resources available on the database, that is, the number of concurrent connections the database can support.

    • Check or modify the filters on tables and/or views, if any.

    • Select the Reload databases check box if you want to reload all databases in your connection on the server when you run the overview analysis.

      When you try to reload a database, a message prompts you for confirmation, because any change in the database structure may affect existing analyses.

  6. In the Context Group Settings view, select from the list the context environment you want to use to run the analysis.

    The table in this view lists all the context environments and values that you define in the Contexts view of the analysis editor. For further information, see Using context variables in analyses.

  7. Press F6 to execute the analysis.

    A message opens at the bottom of the editor to confirm that the operation is in progress, and the analysis results open in the Analysis Results view.
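
The table and view filters set in step 3 narrow what the analysis examines. This guide does not spell out the exact filter syntax the studio accepts, so the sketch below only illustrates the general idea, assuming LIKE-style SQL patterns with `%` as a wildcard. It uses Python's built-in `sqlite3` module with made-up table and view names, and the helper `matching` is hypothetical.

```python
import sqlite3

# In-memory database with made-up tables and one view, for illustration only.
conn = sqlite3.connect(":memory:")
for name in ("customer", "customer_address", "orders"):
    conn.execute(f'CREATE TABLE "{name}" (id INTEGER PRIMARY KEY)')
conn.execute("CREATE VIEW v_customer AS SELECT id FROM customer")


def matching(connection, kind, pattern="%"):
    """List objects of the given kind ('table' or 'view') whose name matches a LIKE pattern."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = ? AND name LIKE ? ORDER BY name",
        (kind, pattern),
    )
    return [row[0] for row in rows]


# No filter: every table is examined, which mirrors the default behavior.
all_tables = matching(conn, "table")
# Assumed LIKE-style filters: only names starting with the given prefix.
filtered_tables = matching(conn, "table", "cust%")
filtered_views = matching(conn, "view", "v%")

print(all_tables)       # ['customer', 'customer_address', 'orders']
print(filtered_tables)  # ['customer', 'customer_address']
print(filtered_views)   # ['v_customer']
```

On a large database, a narrow filter of this kind limits the number of objects the overview analysis has to query.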

From the Statistical information view, you can:

  • Click a catalog or a schema to list all tables included in it along with a summary of their content: number of rows, keys and user-defined indexes.

    The selected catalog or schema is highlighted in blue. Catalogs or schemas highlighted in red indicate potential problems in the data.

  • Right-click a catalog or a schema and select Overview analysis to analyze the content of the selected item.

  • Right-click a table or a view and select Table analysis to create a table analysis on the selected item. You can also view the keys and indexes of a selected table. For further information, see Displaying keys and indexes of database tables.

  • Click any column header in the analytical table to sort the data listed in catalogs or schemas alphabetically.
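
To give a rough idea of the per-table summary described above (number of rows, keys and user-defined indexes), here is a minimal Python sketch against an in-memory SQLite database. It is only an approximation under stated assumptions: the queries the studio actually runs are not shown in this guide, "keys" is interpreted here as primary-key columns, and the function `table_summary` and all table names are hypothetical.

```python
import sqlite3

# Made-up schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT);
    CREATE INDEX idx_customer_email ON customer (email);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customer (email) VALUES ('a@example.com'), ('b@example.com');
    INSERT INTO orders (customer_id) VALUES (1), (1), (2);
""")


def table_summary(connection):
    """Return (table, row count, primary-key columns, user-defined indexes) per table."""
    summary = []
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        rows = connection.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        # Column 5 of PRAGMA table_info is the primary-key position (0 if not a key).
        keys = sum(
            1 for col in connection.execute(f'PRAGMA table_info("{table}")') if col[5] > 0
        )
        # Column 3 of PRAGMA index_list is the origin; 'c' means CREATE INDEX.
        indexes = sum(
            1 for idx in connection.execute(f'PRAGMA index_list("{table}")') if idx[3] == "c"
        )
        summary.append((table, rows, keys, indexes))
    return summary


for line in table_summary(conn):
    print(line)
# ('customer', 2, 1, 1)
# ('orders', 3, 1, 0)
```

Other database systems expose the same information through their own catalog views, so the queries would differ even though the resulting summary has the same shape.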