Configuring the connection to the file system to be used by Spark - 6.5
Applies to: Talend Studio 6.5 (Data mapping)
- Talend Big Data Platform
- Talend Data Fabric
- Talend Data Management Platform
- Talend Data Services Platform
- Talend MDM Platform
- Talend Real-Time Big Data Platform
Procedure
- Double-click tHDFSConfiguration to open its Component view.
- In the Version area, select the Hadoop distribution you need to connect to and its version.
- In the NameNode URI field, enter the location of the machine hosting the NameNode service of the cluster. If you are using WebHDFS, the location should be webhdfs://masternode:portnumber. If this WebHDFS is secured with SSL, use the swebhdfs scheme instead and add a tLibraryLoad component to the Job to load the library required by the secured WebHDFS.
- In the Username field, enter the user name used to authenticate against the HDFS system you are connecting to.
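
The NameNode URI formats described above can be checked before they are typed into the component. The following is a minimal sketch using only the Java standard library. The class NameNodeUri and its methods are hypothetical helpers for illustration, not part of Talend Studio or Hadoop. The hdfs scheme is assumed as the usual non-WebHDFS option, and the port numbers are example values only.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Set;

// Hypothetical helper for illustration; not a Talend or Hadoop API.
final class NameNodeUri {
    // webhdfs and swebhdfs come from the procedure; hdfs is an assumed default.
    private static final Set<String> SCHEMES = Set.of("hdfs", "webhdfs", "swebhdfs");

    // Builds a URI of the form scheme://host:port, rejecting unknown schemes.
    static String build(String scheme, String host, int port) {
        if (!SCHEMES.contains(scheme)) {
            throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        try {
            return new URI(scheme, null, host, port, null, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid host: " + host, e);
        }
    }

    // True when the URI uses SSL-secured WebHDFS, in which case the Job
    // also needs a tLibraryLoad component to load the required library.
    static boolean needsSslLibrary(String nameNodeUri) {
        return "swebhdfs".equals(URI.create(nameNodeUri).getScheme());
    }

    public static void main(String[] args) {
        // "masternode" and 50070 are placeholder values.
        String uri = build("webhdfs", "masternode", 50070);
        System.out.println(uri);
        System.out.println("Needs tLibraryLoad: " + needsSslLibrary(uri));
    }
}
```

Replace the host and port with the values of your own cluster. The resulting string is what goes into the NameNode URI field.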