Since version 5.4.2, the Studio does not automatically create, modify or delete any journal and can only run the CDC process on the basis of the journal and receivers that you or the administrator of your AS/400 system can provide depending on the policy of your company. For this reason, ensure that an old receiver has been treated by RUNCDC before deleting it so as to avoid lost of the information recorded in that receiver.
The following two sections describe how to set up a CDC environment in Talend Studio. The contents described include:
the prerequisites on AS/400;
how to configure your system for CDC usage;
how to extract the modified data;
the prerequisites for reusing a CDC environment migrated from one of the 5.4.1 or earlier versions of the Studio.
Prior to setting up CDC in Redo/Archive log mode (journal) on AS/400, you need to verify the prerequisites as follows on your AS/400:
OS AS/400 V5R3M0, V5R4M0 or V6R1M0 is used;
the AS/400 user account for CDC must have *ALLOBJ privileges or at least all of the following privileges:
- FTP (access to the FTP port must be ensured),
- READ access on journal receivers,
- READ access on monitored AS/400 files,
- READ/WRITE access on output library;
the names of the files of interest should not exceed 10 characters;
if the files of interest are already journalized, the journal must be created with option IMAGES (*BOTH)
For further information about the setup of these listed prerequisites, see the manual of your AS/400 system.
The following configuration steps only need to be set up once for a given publisher/subscriber scenario.
To set up a publisher you need to:
Create a new Job in Talend Studio.
Set a DB connection dedicated to CDC by filling in DB connection data. For example, a connection called AS400_CDC.
Set a DB connection to where data is located by filling in DB connection data. For example, a connection called AS400_DATA.
To identify the table(s) to catch, right-click the newly created data connection to retrieve the schema of the source table and load it in the repository. In this example, this data connection is AS400_DATA.
To set the connection between CDC and the data :
Right-click the CDC Foundation folder of the data connection and select Create CDC to open the [Create Change Data Capture] dialog box. In this example, this data connection is AS400_DATA.
In the [Create Change Data Capture] dialog box, click the three-dot button next to the Set Link Connection field to select connection to the database that corresponds to CDC. In this example, select AS400_CDC.
Click Create Subscriber to create the subscribers. Then the command to be executed is displayed. The following image presents an example of this command.
In general, this command reads as follows:
open <AS400_server_host> user <Username> <Password> quote rcmd "crtsavf qgpl/instfitcdc" quote rcmd "clrsavf qgpl/instfitcdc" bin cd qgpl put "<Studio_install>\plugins\org.talend.designer.cdc_<version>\resource\fitcdc.savf" instfitcdc quote rcmd "rstlib savlib(fitcdc) dev(*savf) savf(qgpl/instfitcdc) RSTLIB(<CDC_library_name>)" quote rcmd "CHGCMD CMD(<CDC_library_name>/RUNCDC) PGM(<CDC_library_name>/F2CD00) CURLIB(<CDC_library_name>)" quote rcmd "dltf qgpl/instfitcdc" quit
It is automatically executed via FTP by the Studio to install the RUNCDC program, restore the CDC library (the CDC database) and create the TSUBSCRIBERS table.
If you need to manually execute this command, copy this command and click Skip to close this dialog box. In this situation, this command is not executed by the Studio and you need to paste or even edit this command by yourself and execute it in your AS/400 system.
Otherwise, click Execute to directly run the default command in the Studio. Then a step-by-step execution list appears.
Note that on the list, you might read an error with number 550 describing issues such as the fact that not all objects have been restored. This could be normal if the library that was not restored has in fact been restored in your AS/400 system. Contact the administrator of your AS/400 system for clarification.
Once done, in the [Create Change Data Capture] dialog box, click Finish.
In the CDC Foundation folder, the CDC database connection appears, along with the subscription table schema.
You must specify the table to which the subscriber wants to subscribe and then activate the subscription.
Right-click the schema that corresponds to the source table and select Add CDC. The [Create Subscriber and Execute SQL Script] dialog box displays. The long name and the short name of the source table are both displayed in this dialog box.
The source table to be monitored should have a primary key so that the CDC system can identify the lines on which changes have been done. You cannot set up a CDC environment if the schema of your source table does not have a primary key.
In this example, since the long name CUSTOMERS does not exceed 10 characters, the short name reads the same as the long name.
Note that if you are using a source table retrieved in one of the 5.4.1 or earlier versions of the Studio, only its long name has been retrieved. Then you have to retrieve this table again from the current Studio.
In the Subscriber Name field, enter the name you want to give the subscriber. By default, the subscriber name is APP1.
Click Execute and then Close.
In the CDC Foundation folder, the change table schema and the associated view appear.
From your AS/400 system:
Create a new receiver:
Create a new journal and attach the receiver created in the previous step:
For the file to be monitored, start journaling changes into the journal created in the previous step:
STRJRNPF FILE(<source_library_name>/<file_to_be_monitored)JRN(<source_library_name>/<journal_name) IMAGES(*BOTH)
To view any changes made to the data, right-click the relevant table in the Table schemas folder and select View All Changes to open the relevant dialog box.
Click Finish to close the dialog box.
Once you have defined the CDC environment, you can create a Job in Talend Studio using a CDC AS/400 component, in order to extract the changes made to the data in the source system.
For an example of how to use a CDC component and for more information regarding the properties and the parameters of the tAS400CDC component, see Talend Components Reference Guide.
If you want to use CDC with an AS/400 database, proceed as follows:
Drop tAS400CDC and tLogRow from the Palette onto the design workspace, and link the two components using a Row Main connection.
Double-click on tAS400CDC to display its basic settings.
Select Repository from the Property type drop-down list and click on [...] to fetch the schema which corresponds to your CDC connection. The fields which follow are automatically filled in with the information required to connect to the CDC database.
Select Repository from the Schema drop-down list and click on [...] to fetch the schema which corresponds to the AS/400 table to be monitored.
In the Table Name field, enter the name of the source table monitored by CDC, here CUSTOMERS.
In the Source Library field, enter the name of the source library. By default, this is the same name of the source database.
In the Subscriber field, enter the name of the subscriber who will extract the modified data. By default, the subscriber is named APP1.
In the Events to catch field, select the check box which corresponds to the event(s) to be caught.
From your AS/400 system, execute
CHGJRN JRN(<Source_library_name>/<Source_table_name>) JRNRCV(*GEN)
Alternatively, in the Advanced settings view, select the Customize FTP command check box and enter
<CDC_library_name>/RUNCDC FILE(<Source_library_name>/<Source_table_name>) LIBOUT(<CDC_library_name>) MODE(*DETACHED) MBROPT(*ADD) DTCHJRN(*YES)
This command allows tAS400CDC to detach the older receiver from the journal and create and attach the newer receiver to that journal.
Save the Job and press F6 to run it.
In the console, you can read the output results which correspond to what is displayed in the [View All Changes] dialog box.
If you migrate an established CDC environment from one of the 5.4.1 or earlier versions into the current Studio version, you need to accomplish the following prerequisites before reusing that environment, because the behavior of the Studio for handling CDC has been updated:
The long name and the short name of an AS/400 table are both retrieved with the table schema in the Repository. The CDC table uses automatically the short name as its own name. This means that you have to retrieve your AS/400 table again after the migration in order to have both the source table and the CDC table recognized by the Studio.
The structure of the TSUBSCRIBERS table has been updated in order to contain the long name and the short name of a source table. Therefore, you need to delete the existing CDC and add new CDC to reinitialize your TSUBSCRIBERS table.
The Studio does not create, modify or delete any journal and consequently, cannot automatically detach an older receiver from a journal and attach a newer one to it. Since this detachment and attachment process is indispensable for the Studio to take the last change into account, you have to execute the following command in the AS/400 system:
CHGJRN JRN(<Source_library_name>/<Source_table_name>) JRNRCV(*GEN)
Alternatively, you can as well use a custom FTP command through the tAS400CDC component to automate this process.
For further information about how to use this Customize FTP command feature in tAS400CDC, see Talend Components Reference Guide.