Arranging data flow for the KMeans Job - 7.3


  1. In the Integration perspective of the Studio, create an empty Job from the Job Designs node in the Repository tree view.
    For further information about how to create a Job, see the Talend Open Studio for Big Data Getting Started Guide.
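  2. In the workspace, type the name of each component to be used and select it from the list that appears.
    This Job uses tHDFSConfiguration, tFileInputDelimited, tReplicate, tModelEncoder, tKMeansModel, tPredict and tFileOutputDelimited.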
  3. Connect tFileInputDelimited to tReplicate using the Row > Main link.
  4. In the same way, connect tReplicate to tModelEncoder, and then tModelEncoder to tKMeansModel.
  5. Likewise, connect tReplicate to tPredict, and then tPredict to tFileOutputDelimited.
  6. Leave tHDFSConfiguration as it is, without linking it to any other component.
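
The links described in steps 3 to 5 form a small directed graph with two branches leaving tReplicate. As a way to check the layout, the sketch below models those links as plain Python data and lists each branch from the input component to its end. This is only an illustration: the Job itself is designed graphically in the Studio, and the names LINKS, STANDALONE, build_graph and branches are hypothetical helpers, not part of any Talend API.

```python
# Hypothetical illustration only: the Job is built graphically in the Studio,
# and none of the helper names below belong to a Talend API.

# Each pair is one Row > Main link from steps 3 to 5 (source, target).
LINKS = [
    ("tFileInputDelimited", "tReplicate"),
    ("tReplicate", "tModelEncoder"),
    ("tModelEncoder", "tKMeansModel"),
    ("tReplicate", "tPredict"),
    ("tPredict", "tFileOutputDelimited"),
]

# tHDFSConfiguration is placed in the Job but is not linked in the steps above.
STANDALONE = ["tHDFSConfiguration"]


def build_graph(links):
    """Turn the list of links into an adjacency mapping."""
    graph = {}
    for source, target in links:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])
    return graph


def branches(graph, start):
    """Return every path from start to a component with no outgoing link."""
    targets = graph[start]
    if not targets:
        return [[start]]
    paths = []
    for target in targets:
        for tail in branches(graph, target):
            paths.append([start] + tail)
    return paths


if __name__ == "__main__":
    graph = build_graph(LINKS)
    for path in branches(graph, "tFileInputDelimited"):
        print(" -> ".join(path))
```

Running the sketch prints one line per branch: the first ends at tKMeansModel and the second at tFileOutputDelimited, both passing through tReplicate.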