Detailed description of tables - 6.3

Talend Data Fabric Studio User Guide

EnrichVersion
6.3
EnrichProdName
Talend Data Fabric
task
Data Quality and Preparation
Design and Development
EnrichPlatform
Talend Studio

Reports are aggregations of analyses. A full report is composed of several rows (as many analyses as the report contains).

TDQ_ANALYSIS

  • AN_PK: primary key (artificial)

  • AN_UUID: analysis identifier

  • AN_LABEL: analysis name

  • AN_CREATION_DATE: analysis creation date

  • AN_AUTHOR: analysis author

  • AN_DATA_FILTER: the data filter of the analysis

  • AN_STATUS: analysis status

  • REP_UUID: report identifier

  • REP_CREATION_DATE: report creation date

  • REP_LABEL: report name

  • REP_AUTHOR: report author

  • REP_STATUS: report status

  • AN_BEGIN_DATE: the date when the row is created

  • AN_END_DATE: the date indicating a change in the attributes of the pair (report/analysis)

  • AN_VERSION: the version number of the (report, analysis) which is incremented each time the pair's data change

  • AN_PURPOSE: analysis purpose

  • AN_DESCRIPTION: analysis description

  • REP_PURPOSE: report purpose

  • REP_DESCRIPTION: report description

  • AN_TYPE: the type of analysis

TDQ_ANALYZED_ELEMENT

  • ELT_PK: primary key (artificial)

  • ELT_UUID: identifier of analyzed element

  • ELT_CONNECTION_UUID: connection identifier

  • ELT_CONNECTION_NAME: connection name

  • ELT_CATALOG_NAME: catalog name or Talend default value corresponding to a null value (see SqlConstants.java)

  • ELT_SCHEMA_NAME: schema name or Talend default value corresponding to a null value (see SqlConstants.java)

  • ELT_TABLE_NAME: table name or Talend default value corresponding to a null value (see SqlConstants.java)

  • ELT_COLUMN_NAME: column name or Talend default value corresponding to a null value (see SqlConstants.java)

  • ELT_BEGIN_DATE: historization columns

  • ELT_END_DATE: historization columns

  • ELT_VERSION: historization columns

  • ELT_CONNECTION_URL: connection URL of the database containing the analyzed element

  • ELT_SIZE: analyzed columns size

  • ELT_PROJECT_NAME: current project name

  • ELT_PROJECT_DESCRIPTION: current project description

  • ELT_CTX_GROUP_ID: the context group id

  • ELT_CTX_GROUP_NAME: the context group name

  • ELT_CTX_NAME: the context name

TDQ_ANALYZED_SET

  • AN_SET_PK: primary key (artificial)

  • MATCH_IND_PK: matching indicator id for the A and B sets

  • ELT_A_PK: the element belonging to set A of the matching indicator

  • ELT_B_PK: the element belonging to set B of the matching indicator

TDQ_CALENDAR

  • CAL_PK: primary key

  • CAL_DATE: date

  • CAL_DAY_IN_MONTH: month date

  • CAL_DAY_IN_YEAR: day count in the year

  • CAL_LAST_WEEK_DAY: count of the last day of the current week (CAL_DAY_IN_YEAR) in the year

  • CAL_LAST_MONTH_DAY: count of the last day of the current month (CAL_DAY_IN_YEAR) in the year

  • CAL_WEEK_DAY: day count in the week

  • CAL_DAY_NAME: name of the day

  • CAL_WEEK_NUM: week count

  • CAL_WEEK_PERIOD: the period (yyyyww) where (y) is year, (w) is week count

  • CAL_MONTH: month count

  • CAL_MONTH_NAME: month name

  • CAL_YEAR: year

  • CAL_MONTH_PERIOD: string representing the period: (yyyymm)

  • CAL_QUARTER: quarter 1,2,3 or 4

  • CAL_QUARTER_PERIOD: the period (yyyyq) where (y) is year, (q) is quarter among 1,2,3,4

  • CAL_SEMESTER: semester (1 or 2)

  • CAL_SEMESTER_PERIOD: the period (yyyys) where (y) is year, (s) is semester count

  • CAL_DAY_OFF: the flag to indicate days-off

  • CAL_SPECIAL_DAY: the flag to indicate special dates, Y indicates a special day, N indicates that the day is not special

  • CAL_SPEC_DAY_LABEL: label of the special day

TDQ_DAY_TIME

The time data is stored in UTC (Coordinated Universal Time).

  • TIME_PK: primary key (artificial)

  • TIME_HOUR: hour (0-23)

  • TIME_MINUTES: minutes (0-59)

  • TIME_LABEL: time label

TDQ_INDICATOR_DEFINITION

  • IND_PK: primary key (artificial)

  • IND_LABEL: indicator label

  • IND_UUID: indicator identifier

  • IND_CATEGORY: the category of the indicator.

    Simple indicators such as count indicators belong to the Simple Statistics category, Count subcategory. Text indicators belong to the Text Statistics category. Mean, Median, Lower quartile, upper quartile, min value, max value belong to the Summary Statistics category. Mode belong to the Advanced Statistics category and Mode subcategory. Frequency table belong to the Advanced Statistics category and Frequencies subcategory.

  • IND_CATEG_UUID: category identifier

  • IND_CRITICAL: criticality level of the indicator

  • IND_PURPOSE: indicator purpose

  • IND_DESCRIPTION: indicator description

And the 3 historization columns:

  • IND_BEGIN_DATE

  • IND_END_DATE

  • IND_VERSION

TDQ_INDICATOR_OPTIONS

  • INO_PK: primary key (artificial)

  • INO_CASE_SENSITIVE: case sensitive flag Y, N, U

  • INO_REGEXP: regular expression

  • INO_COUNT_NULLS: flag for counting null values Y, N

  • INO_COUNT_BLANKS: flag for counting blank values Y, N

TDQ_INDICATOR_VALUE

  • INDV_PK: primary key (artificial)

  • TIME_PK: foreign keys to time dimension tables

  • ELEMENT_PK: foreign keys to the analyzed element dimension tables

  • ANALYSIS_PK: foreign keys to the analysis dimension table

  • VALUE_PK: foreign keys to the value dimension tables

  • OPTION_PK: foreign keys to the option dimension tables

  • INDICATOR_PK: foreign keys to the indicator definition dimension tables

  • CALENDAR_PK: foreign keys to the date dimension tables

  • IND_TYPE: data mining type

  • AN_DURATION: duration of the analysis

  • AN_RUNTIME: execution date of the analysis

  • INDV_REAL_VALUE: real value of the indicator

  • INDV_INT_VALUE: integer value of the indicator

  • INDV_ROW_COUNT: number of rows when evaluating this indicator

  • INDV_VALUE_TYPE_INDICATOR: flag to tell whether it is a real value indicator

  • INDV_DTHRESH_OK: any of the flags Y or N or U which indicates whether this indicator respects the data threshold.Y means yes, N means no, U means undefined

  • INDV_DTHRESH_LOW: lower value of data threshold

  • INDV_DTHRESH_HI: higher value of data threshold

  • INDV_ITHRESH_OK: any of the flags Y or N or U to indicate whether this indicator respects the indicator threshold.Y means yes, N means no, U means undefined

  • INDV_ITHRESH_LOW: lower value of indicator threshold

  • INDV_ITHRESH_HI: higher value of indicator threshold

  • INDV_EXP_VAL_OK: a flag Y or N or U which indicates whether this indicator respects the indicator expected value. Y means yes, N means no, U means undefined

  • INDV_EXP_VAL: expected value of the indicator

  • INDV_ITHRESH_PERC_OK: any of the flags Y or N or U which indicates whether this indicator respects the defined percentage threshold.Y means yes, N means no, U means undefined

  • INDV_ITHRESH_PERC_LOW: lower threshold of indicator value in percentage relative to the total row count

  • INDV_ITHRESH_PERC_HI: higher threshold of indicator value in percentage relative to the total row count

  • INDV_DATE_VALUE: stores the date value of the indicator

  • INDV_IDATETHRESH_LO: stores the lower threshold on date

  • INDV_IDATETHRESH_HI: stores the higher threshold on date

  • INDV_IDATETHRESH_OK: a flag Y or N or U to indicate whether this indicator respects the date thresholds."Y" means yes, "N" means no, "U" means undefined

  • REP_DURATION: the duration of the report

  • REP_RUNTIME: execution date of the report

TDQ_MATCH_INDVALUE

  • M_PK: primary key

  • OPTION_PK: reference to the options

  • ANALYSIS_PK: foreign key to the analysis dimension table

  • TABLE_A_PK: foreign key to the table containing the A set

  • TABLE_B_PK: foreign key to the table containing the B set

  • INDICATOR_PK: foreign key to the indicator definition

  • AN_DURATION: duration of the execution of the analysis

  • TIME_PK: foreign key to the time of the analysis execution

  • CALENDAR_PK: foreign key to the date of the analysis execution

  • AN_RUNTIME: date of the analysis execution

  • INDV_MATCH_PERCENT: percentage of matching values

  • INDV_MATCH_COUNT: count of matching values

  • INDV_ROW_COUNT: total row count

  • INDV_DTHRESH_OK: flag Y or N or U which indicates whether this indicator respects the data threshold."Y" means yes, "N" means no, "U" means undefined

  • INDV_DTHRESH_LOW: lower value of data threshold

  • INDV_DTHRESH_HI: higher value of data threshold

  • INDV_ITHRESH_OK: flag Y or N or U which indicates whether this indicator respects the indicator threshold. Y means yes, N means no, U means undefined

  • INDV_ITHRESH_LOW: lower value of indicator threshold

  • INDV_ITHRESH_HI: higher value of indicator threshold

  • INDV_EXP_VAL_OK: flag Y or N or U which indicates whether this indicator respects the indicator expected value. Y means yes, N means no, U means undefined

  • INDV_EXP_VAL: expected value of the indicator

  • REP_DURATION: the duration of the report

  • REP_RUNTIME: execution date of the report

TDQ_MATCHING_INDVALUE

  • M_PK: primary key

  • ANALYSIS_PK: foreign key to the analysis dimension table

  • TABLE_PK: foreign key to the analyzed table

  • TIME_PK: foreign key to the time of the analysis execution

  • CALENDAR_PK: foreign key to the date of the analysis execution

  • AN_DURATION: duration of the execution of the analysis

  • AN_RUNTIME: date of the analysis execution

  • VERSION: reference to the version number in the table TDQ_TABLE_ANALIZED_SET

  • M_ROW_COUNT: total row count

  • M_UNIQUE_COUNT: count of unique records

  • M_SUSPECT_COUNT: count of suspect records

  • M_MATCH_COUNT: count of matched records

  • REP_DURATION: duration of the report execution

  • REP_RUNTIME: execution date of the report

The dimension tables that relate to this fact table are the same as those of TDQ_SET_INDVALUE table except for IND_PK which does not exist in the TDQ_MATCHING_INDVALUE fact table.

TDQ_GROUP_STATISTICS

  • G_PK: primary key

  • M_PK: foreign key to the matching indicator results (M_PK of TDQ_MATCHING_INDVALUE

  • G_SIZE: group size

  • G_GROUP_COUNT: group count

  • G_RECORD_COUNT: record count

  • REP_RUNTIME: execution date of the report

TDQ_OVERVIEW_INDVALUE

  • TV_PK: primary key (technical key)

  • AN_PK: foreign key to the analysis

  • CAL_PK: foreign key to the execution date of the analysis

  • TIME_PK: foreign key to the execution time of the analysis

  • IND_PK: foreign key to the indicator definition

  • ELT_PK: foreign key to the analyzed element (for tables, the analyzed element references their parent schema or catalog)

  • TV_NAME: table of View name. Can be empty when the row represents the level of schema or catalog or connection

  • TV_CATALOGCOUNT: number of catalogs

  • TV_SCHEMACOUNT: number of schemas

  • TV_TABLECOUNT: number of tables

  • TV_ROWCOUNT: number of rows in tables

  • TV_KEYCOUNT: number of keys

  • TV_INDEXCOUNT: number of indices

  • TV_TRIGGERCOUNT: number of triggers

  • TV_VIEWCOUNT: number of views (should be 0 or 1)

  • TV_TABLE_FILTER: table filter used in the analysis

  • TV_VIEW_FILTER: view filter used in the analysis

  • AN_RUNTIME_TIMESTAMP: execution date of the analysis

  • AN_DURATION: duration of the analysis (ms)

  • REP_DURATION: the duration of the report

  • REP_RUNTIME: execution date of the report

TDQ_SET_INDVALUE

  • SET_INDV_PK: primary key (artificial)

  • TIME_PK: foreign key to dimensional TDQ_DAY_TIME table

  • CAL_PK: foreign key to dimensional calendar table

  • AN_PK: foreign key to dimensional TDQ_ANALYSIS table

  • IND_PK: foreign key to TDQ_INDICATOR_DEFINITION table

  • SIV_INT_VALUE: indicator count value

  • AN_DURATION: analysis duration

  • AN_RUNTIME: execution date of the analysis

  • VERSION: reference to version number in table TDQ_TABLE_ANALYZED_SET

  • SIV_ROW_COUNT: total row count

  • SIV_THRESH_OK: flag Y or N or U which indicates whether this indicator respects the indicator threshold. Y means yes, N means no, U means undefined

  • SIV_THRESH_LOW: lower value of indicator threshold

  • SIV_THRESH_HI: higher value of indicator threshold

  • SIV_THRESH_PERC_OK: flag Y or N or U which indicates whether this indicator respects the defined percentage threshold.Y means yes, N means no, U means undefined

  • SIV_THRESH_PERC_LOW: lower threshold of indicator value in percentage relative to the total row count

  • SIV_THRESH_PERC_HI: higher threshold of indicator value in percentage relative to the total row count

  • TABLE_PK: reference to the table containing the analyzed elements

  • IND_PARENT_PK: link the child indicator to its parent indicator

  • ELT_PK: link to the column

  • REP_DURATION: the duration of the report

  • REP_RUNTIME: execution date of the report

TDQ_VALUES

  • VAL_PK: primary key (artificial)

  • VAL_STRING: value (among the possible values in frequency tables)

TDQ_PRODUCT

  • PR_VERSION: contains the version number of the release of TDQ

TDQ_TABLE_ANALYZED_SET

  • AN_SET_PK: primary key

  • VERSION: version of analysis determined by column set variations

  • ELT_PK: foreign key to dimensional table TDQ_ANALYZED_ELEMENT

  • AN_PK: foreign key to dimensional table TDQ_ANALYSIS

TDQ_MIGRATION

  • ID: primary key (artificial)

  • MIGRATION_NAME: the name of migration

  • PREVIOUS_VERSION: the previous product version

  • UPDATE_VERSION: the updated product version

  • MIGRATION_DATE: the migration date