For more technologies supported by Talend, see Talend components.
tModelEncoder: several tModelEncoder components are used to transform the given SMS text messages into feature sets.
tRandomForestModel: it analyzes the features coming from tModelEncoder to build a classification model that learns what a junk message and a normal message look like.
tClassify: in a new Job, it applies this classification model to a new set of SMS text messages to separate the spam from the normal messages. In this scenario, the result of this classification is used to evaluate the accuracy of the model, because the correct class of each message processed by tClassify is already known and explicitly marked.
A configuration component such as tHDFSConfiguration in each Job: this component is used to connect to the file system to which the jar files that the Job depends on are transferred during the execution of the Job.
This file-system-related configuration component is required unless you run your Spark Jobs in the Local mode.
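
To make the first and last steps of this flow more concrete, the following is a minimal, plain-Python sketch of how a text message could be turned into a feature set and how the accuracy of a classification could be evaluated against known labels. It is not Talend code and does not reproduce what tModelEncoder or tClassify do internally: the hashing-based encoding, the function names (tokenize, encode, accuracy), the vector size, and the sample messages and labels are all assumptions made for illustration. The model-building step performed by tRandomForestModel is deliberately left out.

```python
import re
import zlib

# Hypothetical feature vector size, chosen only for this illustration.
NUM_FEATURES = 16


def tokenize(message):
    """Lowercase a message and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", message.lower())


def encode(message, num_features=NUM_FEATURES):
    """Map a message to a fixed-size vector of token counts.

    Each token is hashed to a bucket index with CRC32, which is
    deterministic across runs (unlike Python's built-in hash()).
    """
    vector = [0] * num_features
    for token in tokenize(message):
        index = zlib.crc32(token.encode("utf-8")) % num_features
        vector[index] += 1
    return vector


def accuracy(predicted, actual):
    """Return the share of predictions that match the known labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one labelled message is required")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Invented sample messages with explicitly marked classes.
messages = [
    "WIN a FREE prize now",
    "Are we still meeting at noon?",
    "Free entry, text WIN to claim",
    "Call me when you get home",
]
known_labels = ["spam", "ham", "spam", "ham"]

# One feature set per message, ready to be fed to a classifier.
features = [encode(m) for m in messages]

# Hypothetical classifier output, hard-coded because no model is trained here.
predicted_labels = ["spam", "ham", "ham", "ham"]

score = accuracy(predicted_labels, known_labels)
```

In the scenario itself, the predicted labels would come from tClassify applying the model built by tRandomForestModel; they are hard-coded here only so that the accuracy calculation can be shown on its own.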