Step 3: Extracting change data

Change Data Capture

author
Talend Documentation Team
EnrichVersion
6.4
EnrichProdName
Talend Data Services Platform
Talend Data Integration
Talend Data Fabric
Talend Big Data
Talend Big Data Platform
Talend Data Management Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

About this task

After setting up the CDC environment, you can design a Job using the tOracleCDC component to incrementally extract the change data from the LEADFACT table. To do that:

Procedure

  1. From the Palette, drop the tOracleCDC and tLogRow components onto the design workspace.
  2. Link the two components using a Row > Main link.
  3. Double-click tOracleCDC to open its Basic settings view and define its properties.
  4. Set Property Type to Repository and then select the schema corresponding to your Oracle DB table, cdc_publisher in this scenario. The connection details are displayed automatically in the corresponding fields.
    Note:

    If you have not stored the data warehouse connection details in the Metadata folder in the Repository tree view, select Built-in in the Property Type list and set the connection details manually.

  5. In the Schema using CDC field, select Repository and then select the schema of the LEADFACT table stored in the Metadata folder.
  6. In the Table using CDC field, enter the name of the table captured by the CDC, in this scenario Leadfact.
  7. In the Events to catch field, select the check boxes corresponding to the types of change events the subscriber will extract. In this scenario, select all three check boxes.
  8. Double-click tLogRow to display its Basic settings view and define its properties.
  9. Click the Sync columns button to retrieve the schema from the preceding component.
  10. Click Edit schema to open the schema dialog box.
  11. In the Date Pattern column of the TALEND_CDC_CREATION_DATE line, enter the desired date format between double quotation marks: "yyyy-MM-dd".
  12. Save your Job and press F6 to execute it.
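
The date pattern entered in the schema can be tried outside the Studio. The following is a minimal sketch, assuming the Date Pattern column follows the java.text.SimpleDateFormat conventions used by the Java code a Talend Job is compiled to. The class and method names are hypothetical and are not part of Talend.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not a Talend API: shows what the "yyyy-MM-dd" pattern produces.
class DatePatternSketch {

    // Formats a date with the given pattern, in UTC so the result is reproducible.
    static String format(Date date, String pattern) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ROOT);
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter.format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L); // 1970-01-01T00:00:00Z
        System.out.println(format(epoch, "yyyy-MM-dd")); // prints 1970-01-01
    }
}
```

With this pattern, only the year, month and day of TALEND_CDC_CREATION_DATE would be shown; the time of day is dropped from the output.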

Results

In Redo log mode, a modification is reported as two operations: first an "update and delete" operation (UO), then an "update and insert" operation (UN). As a result, the client data is displayed twice:

- First, data is deleted (UO).

- Second, data is inserted (UN).
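
To read such output as single updates, the two rows can be paired back together. The following is a rough sketch only: the ChangeRow class, its fields, and the "I" event code are hypothetical, and it assumes that the UO row carries the value before the update and that the UN row with the same key follows it directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of extracted change rows; not a Talend API.
class RedoLogPairingSketch {

    static final class ChangeRow {
        final String type;  // event code, for example "UO" or "UN"
        final String key;   // illustrative primary key
        final String value; // illustrative column value

        ChangeRow(String type, String key, String value) {
            this.type = type;
            this.key = key;
            this.value = value;
        }
    }

    private static String raw(ChangeRow row) {
        return row.type + " " + row.key + ": " + row.value;
    }

    // Merges a UO row directly followed by a UN row with the same key into one line.
    // Any other row, including an unmatched UO row, is reported unchanged.
    static List<String> describe(List<ChangeRow> rows) {
        List<String> out = new ArrayList<>();
        ChangeRow pendingOld = null;
        for (ChangeRow row : rows) {
            if ("UN".equals(row.type) && pendingOld != null && pendingOld.key.equals(row.key)) {
                out.add("update " + row.key + ": " + pendingOld.value + " -> " + row.value);
                pendingOld = null;
                continue;
            }
            if (pendingOld != null) {
                out.add(raw(pendingOld)); // the UO row had no matching UN row
                pendingOld = null;
            }
            if ("UO".equals(row.type)) {
                pendingOld = row;
            } else {
                out.add(raw(row));
            }
        }
        if (pendingOld != null) {
            out.add(raw(pendingOld));
        }
        return out;
    }

    public static void main(String[] args) {
        List<ChangeRow> rows = Arrays.asList(
                new ChangeRow("UO", "42", "Smith"),
                new ChangeRow("UN", "42", "Smyth"),
                new ChangeRow("I", "43", "Jones"));
        for (String line : describe(rows)) {
            System.out.println(line);
        }
    }
}
```

In this sketch the two rows for key 42 collapse into one "update" line, while the row for key 43 passes through as is.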

Once these modifications are extracted, they are no longer available in the modified table. To verify their extraction, right-click the LEADFACT table captured by the CDC and then select View All Changes. The extracted changes are no longer displayed.

For another CDC scenario using the Trigger mode, see Retrieving modified data using CDC.