Loading the data from the local file - 7.3

HDFS

Version
7.3
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Real-Time Big Data Platform
Module
Talend Studio
Last publication date
2024-02-21

Procedure

  1. Double-click tHDFSPut to open its Basic settings view and define the component.
  2. Select, for example, Apache 0.20.2 from the Hadoop version list.
  3. In the NameNode URI, Username, and Group fields, enter the connection parameters for HDFS. If you are using WebHDFS, the location should be webhdfs://masternode:portnumber. WebHDFS with SSL is not supported yet.
  4. Next to the Local directory field, click the [...] button to browse to the folder that contains the file to be loaded into HDFS. In this scenario, the directory was specified while configuring tFileOutputDelimited: C:/hadoopfiles/putFile/.
  5. In the HDFS directory field, enter the location in HDFS where the file is to be stored. In this example, it is /testFile.
  6. Click the Overwrite file field to open its drop-down list.
  7. From the menu, select always.
  8. In the Files area, click the plus button to add a row in which you define the file to be loaded.
  9. In the File mask column, replace newLine between the quotation marks with *.txt, and leave the New name column as it is. This loads all the .txt files in the specified directory without changing their names. In this example, the file is in.txt.
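
The effect of the File mask and New name columns can be illustrated with a short Python sketch. It is not how tHDFSPut is implemented, and it does not connect to HDFS. It assumes that the mask behaves like a shell-style wildcard and that each matching file is stored under the HDFS directory with its original name. The plan_uploads function and the notes.csv file are hypothetical names introduced only for this illustration.

```python
import fnmatch
import posixpath


def plan_uploads(local_files, file_mask, hdfs_dir, new_name=""):
    """Return (local file name, HDFS target path) pairs for names matching the mask."""
    plan = []
    for name in sorted(local_files):
        # fnmatchcase keeps the match case-sensitive on every platform.
        if fnmatch.fnmatchcase(name, file_mask):
            # An empty New name means the file keeps its original name.
            target = new_name or name
            plan.append((name, posixpath.join(hdfs_dir, target)))
    return plan


# Hypothetical content of C:/hadoopfiles/putFile/; only in.txt comes from the scenario.
local_files = ["in.txt", "notes.csv"]

plan = plan_uploads(local_files, "*.txt", "/testFile")
print(plan)
```

Under these assumptions, only in.txt matches the mask, and its planned target is /testFile/in.txt. The notes.csv file is skipped.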