Selecting the identical columns you want to compare - 7.1

Talend Real-time Big Data Platform Studio User Guide

author: Talend Documentation Team
EnrichVersion: 7.1
EnrichProdName: Talend Real-Time Big Data Platform
task: Design and Development
EnrichPlatform: Talend Studio

Procedure

  1. Expand DB connections, browse to the columns you want to analyze in the desired database, select them, and then click Finish to close the wizard.
    A file for the newly created analysis is listed under the Analysis folder in the DQ Repository tree view. The analysis editor opens with the defined analysis metadata.
    The display of the analysis editor depends on the parameters you set in the Preferences window. For more information, see Setting preferences of analysis editors and analysis results.
  2. Click Analyzed Column Sets to open the view where you can set the columns or modify your selection.
    In this example, you want to compare identical columns in the account and account_back tables.
  3. From the Connection list, select the connection to the database you want to analyze.
    This list contains all the database connections you have created and centralized in the Studio repository.
  4. Click A Column Set to open the Column Selection dialog box.
  5. Browse the catalogs/schemas in your database connection to reach the table holding the columns you want to analyze.
    You can filter the table or column lists by typing the desired text in the Table filter or Column filter fields respectively. The lists then show only the tables or columns that match the text you type.
  6. Click the table name to list all its columns in the right-hand panel of the Column Selection dialog box.
  7. In the list to the right, select the check boxes of the column(s) you want to analyze and click OK to proceed to the next step.
    You can also drag the columns to be analyzed directly from the DQ Repository tree view to the editor.
    If you right-click any of the listed columns in the Analyzed Columns view and select Show in DQ Repository view, the selected column will be automatically located under the corresponding connection in the tree view.
  8. Click B Column Set and follow the same steps to select the second set of columns, or drag the columns to the right column panel.
  9. Select the Compute only number of A rows not in B check box if you want to match the data from the A set against the data from the B set only, and not vice versa.
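
The effect of the Compute only number of A rows not in B check box can be illustrated with a small sketch. This is not how the Studio performs the analysis internally. It is a minimal Python illustration that assumes each column set is a list of row tuples, and that the comparison runs in both directions unless the check box is selected. The function names, the only_a_not_in_b parameter, and the sample rows are hypothetical.

```python
def rows_not_in(source_rows, reference_rows):
    """Return the rows of source_rows that have no identical row in reference_rows."""
    reference = set(reference_rows)
    return [row for row in source_rows if row not in reference]


def compare_column_sets(a_rows, b_rows, only_a_not_in_b=False):
    """Count unmatched rows between two column sets.

    only_a_not_in_b is a hypothetical flag standing in for the
    "Compute only number of A rows not in B" check box.
    """
    result = {"a_not_in_b": len(rows_not_in(a_rows, b_rows))}
    if not only_a_not_in_b:
        # Without the flag, the comparison is also done from B to A.
        result["b_not_in_a"] = len(rows_not_in(b_rows, a_rows))
    return result


# Hypothetical sample data for the account and account_back tables.
account = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
account_back = [(1, "Alice"), (2, "Bob"), (4, "Dave"), (5, "Eve")]

both_directions = compare_column_sets(account, account_back)
a_to_b_only = compare_column_sets(account, account_back, only_a_not_in_b=True)

print(both_directions)
print(a_to_b_only)
```

With the flag set, only the rows of set A that are missing from set B are counted. Without it, the rows of set B that are missing from set A are counted as well.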