Creating a profiling analysis on an ADLS Databricks file via Hive - Cloud - 7.3

Talend Studio User Guide

Version: Cloud 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-13
Available in: Big Data Platform, Cloud API Services Platform, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Management Platform, Data Fabric, Data Management Platform, Data Services Platform, MDM Platform, Real-Time Big Data Platform

After creating a connection to an ADLS Databricks cluster via Hive, you can create a profiling analysis on a specific file.

Before you begin

Procedure

  1. In the DQ Repository tree view, expand Metadata > DB Connections > the JDBC connection > Tables.
  2. Expand the table you want to analyze and, in its Columns folder, select the columns to analyze, then right-click the selection.
    Tip: To create an analysis on all columns, right-click the table name.
  3. Hover over Column Analysis and select the analysis type you need.
    The Create New Analysis wizard is displayed.
  4. Enter a name and click Finish. The other fields are optional.
    A new analysis on the selected ADLS file is created automatically and opens in the analysis editor. Indicators are assigned to the columns automatically, according to the analysis type you selected.

    The analysis applies to the Hive table but computes its statistics on the data stored in ADLS, using the external table mechanism. External tables keep the data in the original file, outside Hive. If the ADLS file you selected for analysis is deleted, the analysis can no longer run.

  5. If needed:
    • Modify the columns to be analyzed: In the Data Preview tab, click Select Columns.
    • Add more indicators or new patterns to the columns: In the Analyzed Columns tab, click Select Indicators.
  6. Run the analysis to display the results in the Analysis Results view in the editor.
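
The external table mechanism mentioned in step 4 can be illustrated with a minimal sketch. It only builds the text of a Hive-style CREATE EXTERNAL TABLE statement, to show that the table definition points at a storage location rather than copying the data. The function name, the table and column names, and the abfss:// location are hypothetical placeholders, and the statement is an assumption for illustration, not the statement Talend Studio generates.

```python
def external_table_ddl(table, columns, location):
    """Build a Hive-style CREATE EXTERNAL TABLE statement (illustrative only).

    table    -- hypothetical table name
    columns  -- dict mapping column names to Hive types, in column order
    location -- folder that holds the delimited file(s); the data stays there
    """
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns.items())
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"STORED AS TEXTFILE LOCATION '{location}'"
    )


# Hypothetical example: a two-column comma-delimited file kept in an ADLS Gen2 folder.
ddl = external_table_ddl(
    "customers",
    {"id": "INT", "name": "STRING"},
    "abfss://container@account.dfs.core.windows.net/data/customers/",
)
print(ddl)
```

Because the statement stores only a location, dropping such a table leaves the file in place, while deleting the file leaves the table with nothing to read. This is why the analysis stops working when the ADLS file is removed.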

What to do next

You can create a report on this analysis. See Creating a report on specific analyses.