Validates data using data quality rules from:
- Talend Cloud Data Stewardship, or
- The hybrid version of Talend Data Stewardship 8.0 R2022-07 or greater.
A data quality rule is a set of business requirements that defines the values your data must comply with.
tDataQualityRules lets Talend Studio connect to Talend Cloud Data Stewardship or the hybrid version of Talend Data Stewardship and retrieve the data quality rules as a JAR file. Talend Studio then uses the retrieved library to apply the data quality rules to your data. For more information on how the component works, see tDataQualityRules local and Cloud/hybrid process.
- Valid:
  - Valid: The data fulfill both the condition and the validation expression, or they fulfill the alternative validation expression only.
  - Not applicable (NA): The data do not fulfill the condition, so the rule cannot be applied to the data.
  These data follow the Main flow.
- Invalid:
  - Invalid: The data fulfill the condition but not the validation expression.
  - Not executable (NE): The rule cannot be executed on the data.
  These data follow the Reject flow.
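The classification above can be sketched in a few lines of Python. This is only an illustration of the four statuses and the two output flows, not the component's actual implementation. The names `Status`, `evaluate` and `route` are hypothetical, and so is the sample rule. The sketch also makes two assumptions that the description does not confirm: the alternative validation expression is checked only when the condition is not met, and a rule counts as not executable when evaluating it raises an error.

```python
from enum import Enum
from typing import Any, Callable, Dict, Optional

Record = Dict[str, Any]
Expression = Callable[[Record], bool]


class Status(Enum):
    VALID = "Valid"
    NOT_APPLICABLE = "NA"
    INVALID = "Invalid"
    NOT_EXECUTABLE = "NE"


def evaluate(record: Record,
             condition: Expression,
             validation: Expression,
             alternative: Optional[Expression] = None) -> Status:
    """Classify one record against one rule (hypothetical helper)."""
    try:
        if condition(record):
            # Condition met: the validation expression decides.
            return Status.VALID if validation(record) else Status.INVALID
        # Assumption: the alternative is only consulted when the
        # condition is not met.
        if alternative is not None and alternative(record):
            return Status.VALID
        return Status.NOT_APPLICABLE
    except Exception:
        # Assumption: an evaluation error means "not executable".
        return Status.NOT_EXECUTABLE


def route(status: Status) -> str:
    """Valid and NA go to the Main flow, Invalid and NE to Reject."""
    if status in (Status.VALID, Status.NOT_APPLICABLE):
        return "Main"
    return "Reject"


# Hypothetical rule: if country is "FR", the zip code must have 5 characters.
def is_french(rec: Record) -> bool:
    return rec["country"] == "FR"


def zip_has_five_chars(rec: Record) -> bool:
    return len(rec["zip"]) == 5
```

With this sample rule, a French record with a five-character zip code is Valid and a French record with a shorter zip code is Invalid. A record from another country is Not applicable, and a French record with no `zip` field is Not executable because the validation expression raises an error.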
This component is not shipped with your Talend Studio by default. You need to install it using the Feature Manager. For more information, see Installing features using the Feature Manager.
- In local mode, Apache Spark 3.0 or later.
- Cloudera Data Engineering service with Apache Spark 3.1 or 3.2.
For more technologies supported by Talend, see Talend components.
Depending on the Talend product you are using, this component can be used in one, some or all of the following Job frameworks:
- Standard: see tDataQualityRules Standard properties.
  The component in this framework is available in Talend Data Management Platform, Talend Big Data Platform, Talend Real Time Big Data Platform, Talend Data Services Platform, and in Talend Data Fabric.
- Spark Batch: see tDataQualityRules properties for Apache Spark Batch.
  The component in this framework is available in all Talend Platform products with Big Data and in Talend Data Fabric.
- Spark Streaming: see tDataQualityRules properties for Apache Spark Streaming.
  This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.