Testing the KMeans model

Machine Learning

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Real-Time Big Data Platform
Talend Data Fabric
Talend Big Data
Talend Big Data Platform
EnrichPlatform
Talend Studio

Procedure

  1. Double-click tPredict to open its Component view.
  2. Select the Define a storage configuration component check box and select the tHDFSConfiguration component to be used.
  3. From the Model type drop-down list, select Kmeans model.
  4. Select the Model on filesystem radio button and enter the directory in which the KMeans model is stored.
    In this case, the schema of the tPredict component contains a read-only column called label, in which the model writes the cluster label assigned to each record.
  5. Double-click tFileOutputDelimited to open its Component view.
  6. Select the Define a storage configuration component check box and select the tHDFSConfiguration component to be used.
  7. In the Folder field, browse to the location in HDFS in which you want to store the prediction result.
  8. From the Action drop-down list, select Overwrite. If the target folder does not exist, select Create instead.
  9. Select the Merge result to single file check box and then the Remove source dir check box.
  10. In the Merge file path field, browse to the location in HDFS in which you want to store the merged prediction result.
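
To make the role of the label column more concrete, the following is a minimal Python sketch of what a KMeans prediction step does conceptually: each record is assigned the index of its nearest cluster center, and that index is appended as an extra column before the rows are written in delimited form. This is not Talend's implementation, which runs on Spark. The function names, the sample centers, the sample rows, and the semicolon delimiter are all assumptions made for illustration.

```python
import csv
import io
import math


def predict_label(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    distances = [math.dist(point, center) for center in centers]
    return distances.index(min(distances))


def append_labels(rows, centers):
    """Append a cluster label to each row, mimicking a read-only label column."""
    return [list(row) + [predict_label(row, centers)] for row in rows]


def to_delimited(rows, delimiter=";"):
    """Serialize rows as delimited text (the delimiter here is an assumption)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical cluster centers, standing in for a trained KMeans model.
centers = [(0.0, 0.0), (10.0, 10.0)]

# Hypothetical input records with two numeric features each.
rows = [(1.0, 2.0), (9.0, 8.5), (0.5, 0.1)]

labeled = append_labels(rows, centers)
output = to_delimited(labeled)
print(output)
```

In the Job itself, tPredict plays the part of the labeling functions and tFileOutputDelimited plays the part of the serialization function. The sketch only shows how the label value relates to the input features.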