Data Sampling and Profiling Options

You can customize the data sampling and profiling options of a request or scheduled action. See the technical details for more information.

Data Sampling – Enable data sampling and specify the number of rows to sample.
Data Profiling – Enable data profiling and specify the number of rows to use in profiling.
Data Select Method – Choose how rows are selected: Top (the default, fast method) or Random (reservoir sampling, when available on the database).
Profile only objects that are not profiled yet – Enable data profiling only on imported objects that have not been profiled.
Data Classification – Enable data classification.
Hide data using Sensitivity Label – The selected sensitivity label is applied to all newly imported objects in the scope (in order to hide them).
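
To illustrate the difference between the two Data Select Method choices, here is a minimal Python sketch. It is not the product's implementation: the function names `sample_top` and `sample_reservoir` are hypothetical, and the reservoir version uses the classic "Algorithm R", which may differ from what a given database provides.

```python
import random
from itertools import islice


def sample_top(rows, n):
    """Top: take the first n rows in whatever order the source returns them."""
    return list(islice(rows, n))


def sample_reservoir(rows, n, seed=None):
    """Random: keep a uniform random sample of n rows in a single pass.

    Classic reservoir sampling (Algorithm R); an illustrative assumption,
    not necessarily what any particular database does internally.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, row in enumerate(rows):
        if i < n:
            # Fill the reservoir with the first n rows.
            reservoir.append(row)
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = row
    return reservoir


if __name__ == "__main__":
    print(sample_top(iter(range(1000)), 5))
    print(sample_reservoir(iter(range(1000)), 5, seed=42))
```

Top only reads the first rows, which is why it is fast but can be unrepresentative when the data is ordered. Reservoir sampling must read every row once, but each row has the same chance of ending up in the sample.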

In addition, sensitivity labels are inferred through data flow lineage: when you apply a sensitivity label to an imported object (e.g. a column), all imported objects downstream in the data flow lineage are given at least that level of sensitivity as "Sensitivity Label Lineage Proposed". As a result, sensitivity labels are tagged automatically by inference across the enterprise architecture. As with "Sensitivity Label Data Proposed", a "Sensitivity Label Lineage Proposed" can be rejected, which stops the propagation of inferred sensitivity labels in that data flow direction. Note that the propagation of inferred sensitivity labels is not affected by any data masking discovered within the ETL/DI/Script imports involved in that data flow.
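
The propagation described above can be pictured as a walk over the lineage graph. The following Python sketch is only a conceptual model under stated assumptions: the label names and their ordering in `LEVELS`, the `propagate` function, and the object names in the example are all hypothetical and do not come from the product.

```python
from collections import deque

# Hypothetical label ordering, lowest to highest sensitivity (an assumption).
LEVELS = {"Public": 0, "Internal": 1, "Confidential": 2}


def propagate(lineage, source, label, existing=None, rejected=frozenset()):
    """Propose `label` on every object downstream of `source`.

    lineage:  dict mapping an object to the objects it feeds directly.
    existing: proposals already present; a higher existing label is kept,
              which models "at least that level of sensitivity".
    rejected: objects where the proposal was rejected; propagation stops there.
    """
    proposed = dict(existing or {})
    queue = deque(lineage.get(source, []))
    seen = set()
    while queue:
        node = queue.popleft()
        if node in seen or node in rejected:
            # A rejection blocks the label here and further along this path.
            continue
        seen.add(node)
        current = proposed.get(node)
        if current is None or LEVELS[label] > LEVELS[current]:
            proposed[node] = label
        queue.extend(lineage.get(node, []))
    return proposed


if __name__ == "__main__":
    lineage = {
        "src.col": ["stg.col"],
        "stg.col": ["dw.col", "rpt.col"],
        "dw.col": ["mart.col"],
    }
    print(propagate(lineage, "src.col", "Confidential"))
    print(propagate(lineage, "src.col", "Confidential", rejected={"dw.col"}))
```

In the second call, rejecting the proposal on the hypothetical `dw.col` object leaves `mart.col` unlabeled, because it is only reachable through the rejected object, while `stg.col` and `rpt.col` still receive the proposed label.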
