Arranging data flow for the KMeans Job - Cloud - 8.0

Machine Learning

Version: Cloud, 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Real-Time Big Data Platform
Module: Talend Studio
Last publication date: 2024-02-20

Procedure

  1. In the Integration perspective of Talend Studio, create an empty Job from the Job Designs node in the Repository tree view.
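  2. In the workspace, enter the name of each component to be used and select it from the list that appears. This Job uses tHDFSConfiguration, tFileInputDelimited, tReplicate, tModelEncoder, tKMeansModel, tPredict, and tFileOutputDelimited.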
  3. Connect tFileInputDelimited to tReplicate using the Row > Main link.
  4. Using the same Row > Main link, connect tReplicate to tModelEncoder, and then tModelEncoder to tKMeansModel.
  5. Using the same Row > Main link, connect tReplicate to tPredict, and then tPredict to tFileOutputDelimited.
  6. Leave tHDFSConfiguration as it is, without linking it to any other component.
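
The links created in this procedure form a small branching data flow: tReplicate sends the same input rows to a model training branch and to a prediction branch. The following Python sketch is only an illustration of that topology, not Talend code or a Talend API. The names LINKS, build_graph, and paths_from are hypothetical helpers introduced here; the component names and links are taken from the steps above.

```python
# Illustrative sketch only: models the Row > Main links from the procedure
# as a directed graph. This is not generated Talend code.

# Each pair is (source component, target component), in the order of the steps.
LINKS = [
    ("tFileInputDelimited", "tReplicate"),
    ("tReplicate", "tModelEncoder"),
    ("tModelEncoder", "tKMeansModel"),
    ("tReplicate", "tPredict"),
    ("tPredict", "tFileOutputDelimited"),
]


def build_graph(links):
    """Build an adjacency list: component name -> list of downstream components."""
    graph = {}
    for source, target in links:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])  # make sure end components appear as keys
    return graph


def paths_from(graph, start):
    """Return every path from start to a component with no outgoing link."""
    targets = graph[start]
    if not targets:
        return [[start]]
    return [[start] + rest for t in targets for rest in paths_from(graph, t)]


GRAPH = build_graph(LINKS)
PATHS = paths_from(GRAPH, "tFileInputDelimited")

if __name__ == "__main__":
    for path in PATHS:
        print(" -> ".join(path))
```

Running the sketch prints one line per branch: the training branch ending at tKMeansModel and the prediction branch ending at tFileOutputDelimited. tHDFSConfiguration does not appear in the graph because the procedure creates no link to or from it.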