Configuring the connection to the file system to be used by Spark - 6.5

Procedure

  1. Double-click tHDFSConfiguration to open its Component view.
  2. In the Version area, select the Hadoop distribution you need to connect to and its version.
  3. In the NameNode URI field, enter the location of the machine hosting the NameNode service of the cluster. If you are using WebHDFS, the location should be webhdfs://masternode:portnumber. If WebHDFS is secured with SSL, use the swebhdfs scheme instead and add a tLibraryLoad component to the Job to load the library that the secured WebHDFS requires.
  4. In the Username field, enter the user name used to authenticate against the HDFS system you are connecting to.
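
The URI forms described in the NameNode URI step can be checked before they are typed into the component. The following is a minimal sketch in Python, not part of Talend Studio: the helper names build_namenode_uri and check_namenode_uri are hypothetical, plain hdfs:// is assumed as the scheme when WebHDFS is not used, and the host name and port numbers in the usage note are illustrative values only.

```python
from urllib.parse import urlsplit

# Schemes named in the procedure (webhdfs, swebhdfs), plus plain hdfs,
# which this sketch ASSUMES is the scheme when WebHDFS is not used.
SCHEMES = {"hdfs", "webhdfs", "swebhdfs"}


def build_namenode_uri(host, port, webhdfs=False, ssl=False):
    """Hypothetical helper: assemble a NameNode URI string."""
    if ssl and not webhdfs:
        # The procedure only describes SSL in the WebHDFS case.
        raise ValueError("ssl=True requires webhdfs=True in this sketch")
    if ssl:
        scheme = "swebhdfs"
    elif webhdfs:
        scheme = "webhdfs"
    else:
        scheme = "hdfs"
    return f"{scheme}://{host}:{port}"


def check_namenode_uri(uri):
    """Hypothetical helper: return the scheme of a well-formed URI.

    Raises ValueError if the scheme is unknown, or if the host or a
    numeric port is missing.
    """
    parts = urlsplit(uri)
    if parts.scheme not in SCHEMES:
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    # Accessing .port raises ValueError itself when the port is not numeric.
    if not parts.hostname or parts.port is None:
        raise ValueError("URI must contain a host and a numeric port")
    return parts.scheme
```

For example, build_namenode_uri("masternode", 50070, webhdfs=True) produces a string of the form shown in step 3, and check_namenode_uri rejects the literal placeholder webhdfs://masternode:portnumber because portnumber is not a number. The actual port depends on how your cluster is configured.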