Arranging data flow for the KMeans Job - 7.0

Machine Learning

author
Talend Documentation Team
EnrichVersion
7.0
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

Procedure

  1. In the Integration perspective of the Studio, create an empty Job from the Job Designs node in the Repository tree view.
    For further information about how to create a Job, see the Talend Open Studio for Big Data Getting Started Guide.
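  2. In the workspace, type the name of each component to be used and select it from the list that appears. This Job uses tHDFSConfiguration, tFileInputDelimited, tReplicate, tModelEncoder, tKMeansModel, tPredict and tFileOutputDelimited.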
  3. Connect tFileInputDelimited to tReplicate using the Row > Main link.
  4. In the same way, connect tReplicate to tModelEncoder, and then tModelEncoder to tKMeansModel.
  5. Likewise, connect tReplicate to tPredict, and then tPredict to tFileOutputDelimited.
  6. Leave tHDFSConfiguration as it is, without linking it to any other component.
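
The links made in steps 3 to 5 form two branches that both pass through tReplicate: one ends at tKMeansModel and the other at tFileOutputDelimited. As a rough illustration only, this layout can be modeled as a small directed graph in plain Python. This is not code generated by the Studio, and it does not use any Talend API; the names LINKS, STANDALONE, build_graph and paths_from are hypothetical and exist only for this sketch.

```python
# Hypothetical plain-Python model of the Job layout; not Talend-generated code.
from collections import defaultdict

# Each pair stands for one Row > Main link created in steps 3 to 5.
LINKS = [
    ("tFileInputDelimited", "tReplicate"),
    ("tReplicate", "tModelEncoder"),
    ("tModelEncoder", "tKMeansModel"),
    ("tReplicate", "tPredict"),
    ("tPredict", "tFileOutputDelimited"),
]

# Components placed in the workspace without any link (step 6).
STANDALONE = ["tHDFSConfiguration"]


def build_graph(links):
    """Map each component to the list of components it feeds."""
    graph = defaultdict(list)
    for source, target in links:
        graph[source].append(target)
    return dict(graph)


def paths_from(graph, start):
    """Return every path from start to a component with no outgoing link."""
    targets = graph.get(start, [])
    if not targets:
        return [[start]]
    return [[start] + rest for t in targets for rest in paths_from(graph, t)]


GRAPH = build_graph(LINKS)
BRANCHES = paths_from(GRAPH, "tFileInputDelimited")

if __name__ == "__main__":
    # Print one line per branch of the data flow, then the unlinked components.
    for branch in BRANCHES:
        print(" -> ".join(branch))
    for name in STANDALONE:
        print(name + " (no links)")
```

Listing the branches this way is a quick check that every component from step 2 appears either in a branch or in the unlinked list before the components are configured.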