Tracking data insertion changes and writing the changes into a SCD dimension table - Cloud - 8.0

SCDELT

Version
Cloud
8.0
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Content
Data Governance > Third-party systems > Business Intelligence components > SCDELT components
Data Quality and Preparation > Third-party systems > Business Intelligence components > SCDELT components
Design and Development > Third-party systems > Business Intelligence components > SCDELT components
Last publication date
2024-02-20

Procedure

  1. Double-click the first tJDBCSCDELT component to open its Basic settings view.
  2. Select the Use an existing connection check box and from the Component List drop-down list displayed, select the connection component to reuse the connection created by it, tJDBCConnection_1 in this example.
  3. In the Source table field, enter the name of the table whose data changes will be captured, employee in this example.
  4. In the Table field, enter the name of the SCD dimension table that will store both the current and historical employee data, employee_scd in this example.
  5. Select Drop table if exists and create from the Action on table drop-down list to create the SCD dimension table.
  6. Click the [...] button next to Edit schema and in the pop-up dialog box, define the schema by adding nine columns: sk and id of Integer type as the primary key, name and role of String type, salary of Double type, start_date and end_date of Date type with the Date Pattern yyyy-MM-dd, and active_status and version of Integer type. When done, click OK to save the changes and close the dialog box.
  7. From the Surrogate key drop-down list, select the name of the column that will be used as the primary key of the SCD dimension table, sk in this example.
  8. Select DB sequence from the Creation drop-down list and in the Sequence field displayed, enter the name of the Snowflake sequence used to generate the surrogate key for the SCD Type 2 method.
  9. Click the [+] button below the Source keys table to add a new line, and click the Name cell and select the key column of the source table from the drop-down list, id in this example.
  10. Select the Use SCD type 1 fields check box, click the [+] button below the SCD type 1 fields table twice to add two lines. Then click each cell and from the drop-down list, select the column on which the SCD Type 1 method will be performed. In this example, they are name and role.
  11. Select the Use SCD type 2 fields check box, click the [+] button below the SCD type 2 fields table to add a line. Then click the cell and select the column on which the SCD Type 2 method will be performed. In this example, it is salary.
  12. From the Start date and End date drop-down lists, select the columns used to hold the start date and end date values for the SCD Type 2 method respectively, start_date and end_date in this example.
  13. Select the Log active status check box and from the Active field drop-down list displayed, select the column used to hold the active status value for the SCD Type 2 method, which helps identify the active records, active_status in this example.
  14. Select the Log versions check box and from the Version field drop-down list, select the column used to hold the version number of the records for the SCD Type 2 method, version in this example.
  15. Select Mapping Snowflake from the Mapping drop-down list to use the Snowflake metadata mapping file.