Tracking inserted data changes and writing the changes into a SCD dimension table - 6.5

SCDELT

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for ESB
Talend Open Studio for MDM
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

Procedure

  1. Double-click the first tPostgreSQLSCDELT component to open its Basic settings view.
  2. Select the Use an existing connection check box, then from the Component List drop-down list that appears, select the connection component whose connection you want to reuse, tPostgreSQLConnection_1 in this example.
  3. In the Source table field, enter the name of the table whose data changes will be captured, employee in this example.
  4. In the Table field, enter the name of the SCD dimension table that will store both the current and historical employee data, employee_scd in this example.
  5. Select Create table from the Action on table drop-down list to create the SCD dimension table.
  6. Click the [...] button next to Edit schema and, in the pop-up dialog box, define the schema by adding nine columns: sk (the primary key) and id of Integer type, name and role of String type, salary of Double type, start_date and end_date of Date type with the Date Pattern dd-MM-yyyy, and active_status and version of Integer type. When done, click OK to save the changes and close the dialog box.
  7. From the Surrogate key drop-down list, select the name of the column that will be used as the primary key of the SCD dimension table, sk in this example.
  8. Select DB sequence from the Creation drop-down list and in the Sequence field displayed, enter the name of the PostgreSQL sequence used to generate the surrogate key for the SCD Type 2 method, employee_sequence in this example.
  9. Click the [+] button below the Source keys table to add a line, then click the Name cell and select the key column of the source table from the drop-down list, id in this example.
  10. Select the Use SCD type 1 fields check box and click the [+] button below the SCD type 1 fields table twice to add two lines. Then click each cell and from the drop-down list, select a column on which the SCD Type 1 method will be performed. In this example, the columns are name and role.
  11. Select the Use SCD type 2 fields check box and click the [+] button below the SCD type 2 fields table to add a line. Then click the cell and from the drop-down list, select the column on which the SCD Type 2 method will be performed. In this example, it is salary.
  12. From the Start date and End date drop-down lists, select the columns used to hold the start date and end date values for the SCD Type 2 method respectively, start_date and end_date in this example.
  13. Select the Log active status check box and from the Active field drop-down list displayed, select the column used to hold the active status value for the SCD Type 2 method, which helps identify the active records, active_status in this example.
  14. Select the Log versions check box and from the Version field drop-down list, select the column used to hold the version number of the records for the SCD Type 2 method, version in this example.
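
The settings above can be easier to follow with a concrete picture of what SCD Type 1 and Type 2 handling mean for the employee_scd columns. The following Python sketch is a conceptual illustration only, not the SQL that the component generates. It simulates the dimension table in memory and uses the column names from this example. The function name apply_scd, the use of 1 and 0 for active_status, version numbering starting at 1, an empty end_date for the active record, and the choice to overwrite Type 1 fields on all versions of a record are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from itertools import count
from typing import Optional


@dataclass
class DimRow:
    """One row of the employee_scd dimension table from this example."""
    sk: int                    # surrogate key (stands in for employee_sequence)
    id: int                    # source key
    name: str                  # SCD Type 1 field
    role: str                  # SCD Type 1 field
    salary: float              # SCD Type 2 field
    start_date: date
    end_date: Optional[date]   # assumption: empty while the row is active
    active_status: int         # assumption: 1 = active, 0 = historical
    version: int               # assumption: first version is 1


def apply_scd(dimension, source_rows, today, next_sk):
    """Hypothetical helper: apply source rows (dicts) to the dimension list."""
    for src in source_rows:
        history = [r for r in dimension if r.id == src["id"]]
        current = next((r for r in history if r.active_status == 1), None)

        if current is None:
            # Unknown source key: insert the first version of the record.
            dimension.append(DimRow(next(next_sk), src["id"], src["name"],
                                    src["role"], src["salary"],
                                    today, None, 1, 1))
            continue

        # Type 1: overwrite in place, no history kept
        # (assumption: every version of the record is updated).
        for row in history:
            row.name, row.role = src["name"], src["role"]

        # Type 2: close the active row and add a new version.
        if current.salary != src["salary"]:
            current.end_date = today
            current.active_status = 0
            dimension.append(DimRow(next(next_sk), src["id"], src["name"],
                                    src["role"], src["salary"],
                                    today, None, 1, current.version + 1))
    return dimension


# Usage: a role change (Type 1) and a salary change (Type 2) arrive together.
sk_sequence = count(1)
dim = []
apply_scd(dim, [{"id": 1, "name": "Ann", "role": "dev", "salary": 1000.0}],
          date(2020, 1, 1), sk_sequence)
apply_scd(dim, [{"id": 1, "name": "Ann", "role": "lead", "salary": 1200.0}],
          date(2020, 6, 1), sk_sequence)
for row in dim:
    print(row)
```

In this sketch, the second load leaves two rows for id 1: the original row is closed (end_date set, active_status 0) and a new row with version 2 holds the new salary, while the role change is simply overwritten without creating a new version.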