Processing and moving files located on an FTP server - Cloud

Talend Cloud Apps Connectors Guide

Version: Cloud
Language: English
Product: Talend Cloud
Module: Talend Data Inventory, Talend Data Preparation, Talend Pipeline Designer
Content: Administration and Monitoring > Managing connections; Design and Development > Designing Pipelines
Last publication date: 2024-03-21

This scenario helps you set up and use connectors in a pipeline. Adapt it to your environment and use case.

Procedure

  1. Click Connections > Add connection.
  2. In the panel that opens, select the type of connection you want to create.

    Example

    FTP
  3. Select your engine in the Engine list.
    Note:
    • For advanced data processing, it is recommended to use the Remote Engine Gen2 rather than the Cloud Engine for Design.
    • If no Remote Engine Gen2 has been created from Talend Management Console, or if one exists but appears as unavailable (meaning it is not up and running), you cannot select a Connection type in the list or save the new connection.
    • The list of available connection types depends on the engine you have selected.
  4. Select the type of connection you want to create.
    Here, select FTP.
  5. Fill in the connection properties to access your FTP server as described in FTP properties, check the connection and click Add dataset.
  6. In the Add a new dataset panel, fill in the required properties to point to the FTP directory in which your file is located, then click View sample to preview your dataset sample.
    Here, the file to be retrieved is a CSV file that lists restaurants in Baltimore. It is located in a Talend/Files folder.
  7. Click Validate to save your dataset.
  8. On the same FTP connection, add another dataset to use as the destination in your pipeline. Here, it points to a Talend/Out folder.
  9. Click Add pipeline on the Pipelines page. Your new pipeline opens.
  10. Give the pipeline a meaningful name.

    Example

    Processing and moving files on FTP server
  11. Click ADD SOURCE and, in the panel that opens, select your source dataset, here restaurant on FTP dir.
  12. Add processors to the pipeline, for example an Aggregate processor to list all the restaurant addresses.
  13. Configure the processor. In the Operations area:
    1. Select .location in the Field path list.
    2. Select List in the Operation list.
    3. Enter a name in the Output field name field, here address.
    4. Save your configuration.

    The restaurant addresses are now aggregated into a single record.

  14. Add a Normalize processor to the pipeline to flatten the address record and split each entry into a separate record.
  15. Configure the processor. In the Operations area:
    1. Select .address in the Field path to normalize list.
    2. Enable the Is list option.
    3. Save your configuration.
  16. Click the ADD DESTINATION item on the pipeline to open the panel in which you select the FTP output directory where your output file will be uploaded.
  17. Give the destination a meaningful name, for example addresses on FTP out dir.
  18. In the Configuration tab of the destination, check that the file you want to upload does not exceed the size limit.
  19. Click Save to save your configuration.
  20. On the top toolbar of Talend Cloud Pipeline Designer, click the Run button to open the panel allowing you to select your run profile.
  21. Select your run profile in the list (for more information, see Run profiles), then click Run to run your pipeline.
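
The effect of the Aggregate and Normalize steps (steps 13 and 15) can be pictured with a short Python sketch. This is not Talend's implementation: the function names aggregate_list and normalize_list and the sample records are hypothetical, and the real processors may treat other fields and edge cases differently. Only the field names location and address come from the scenario.

```python
from typing import Any


def aggregate_list(records: list[dict[str, Any]], field: str,
                   output_field: str) -> dict[str, list[Any]]:
    """Mimic a List aggregation: gather one field from all records into a single record."""
    return {output_field: [r[field] for r in records if field in r]}


def normalize_list(record: dict[str, Any], field: str) -> list[dict[str, Any]]:
    """Mimic Normalize with 'Is list' enabled: emit one record per list entry."""
    return [{field: value} for value in record.get(field, [])]


# Hypothetical sample rows standing in for the restaurant CSV file.
restaurants = [
    {"name": "Restaurant A", "location": "100 Example St"},
    {"name": "Restaurant B", "location": "200 Sample Ave"},
]

# Step 13: all locations end up in one record under the "address" field.
aggregated = aggregate_list(restaurants, "location", "address")

# Step 15: the list is split again, one address per record.
normalized = normalize_list(aggregated, "address")
```

After aggregation there is a single record holding a list of two addresses; after normalization there are two records with one address each, which is the shape written to the destination in this sketch.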

Results

Your pipeline is executed. The restaurant data stored in the FTP source directory is processed, and the output file is uploaded to the FTP target directory you specified:
  • The FTP target directory contains the newly uploaded file.

  • The CSV output file contains the list of restaurant addresses.
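
To confirm the upload outside Talend Cloud, you could list the target directory with any FTP client. The following is a minimal sketch using Python's standard ftplib module, assuming a plain FTP server with user and password authentication. The function names, host, and credentials are placeholders for illustration; only the Talend/Out folder comes from the scenario.

```python
from ftplib import FTP


def list_directory(ftp, directory: str) -> list[str]:
    """Change to `directory` on an open FTP session and return its entries, sorted."""
    ftp.cwd(directory)
    return sorted(ftp.nlst())


def check_output(host: str, user: str, password: str,
                 directory: str = "Talend/Out") -> list[str]:
    """Open a session, log in, and list the pipeline's target directory."""
    with FTP(host) as ftp:  # the session is closed when the block exits
        ftp.login(user=user, passwd=password)
        return list_directory(ftp, directory)


# Example call (placeholder values, not executed here):
# files = check_output("ftp.example.com", "my_user", "my_password")
```

Keeping list_directory separate from the connection logic makes it possible to exercise the listing step with a stand-in object, without a live server.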