HDFS components

tHDFSCompare Compares two files in HDFS and, based on the read-only schema, generates a row flow that presents the comparison information.
tHDFSConfiguration Enables reuse of the HDFS connection configuration within the same Job.
tHDFSConnection Connects to a given HDFS so that the other Hadoop components can reuse the connection it creates to communicate with this HDFS.
tHDFSCopy Copies a source file or folder into a target directory in HDFS and removes this source if required.
tHDFSDelete Deletes a file located on a given Hadoop distributed file system (HDFS).
tHDFSExist Checks whether a file exists in a specific directory in HDFS.
tHDFSGet Copies files from the Hadoop distributed file system (HDFS) to a user-defined directory and, if needed, renames them.
tHDFSInput Extracts the data in an HDFS file so that other components can process it.
tHDFSList Retrieves a list of files or folders based on a filemask pattern and iterates on each item.
tHDFSOutput Writes the data flows it receives into a given Hadoop distributed file system (HDFS).
tHDFSOutputRaw Transfers data of different formats, such as hierarchical data, in the form of a single column into a given HDFS.
tHDFSProperties Creates a single row flow that displays the properties of a file processed in HDFS.
tHDFSPut Connects to the Hadoop distributed file system (HDFS) to load large-scale files into it with optimized performance.
tHDFSRename Renames the selected files or specified directory on HDFS.
tHDFSRowCount Reads a file in HDFS row by row in order to determine the number of rows this file contains.
