tMahoutClustering (deprecated)

Machine Learning

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Big Data Platform
Talend Data Fabric
Talend Real-Time Big Data Platform
Talend Big Data
task
Data Quality and Preparation > Third-party systems > Machine Learning components
Data Governance > Third-party systems > Machine Learning components
Design and Development > Third-party systems > Machine Learning components
EnrichPlatform
Talend Studio

Groups unlabeled numerical data into clusters that can reveal interesting patterns or help identify abnormal data items in the data set.

tMahoutClustering groups data items into clusters based on how similar they are to one another. The component offers several similarity methods, each of which can be used with different clustering algorithms.
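The relationship between a similarity method and a clustering algorithm can be illustrated with a minimal sketch. The code below is plain Python and does not use Mahout or any Talend API; the function names (`euclidean`, `manhattan`, `kmeans`) and the sample data are hypothetical and chosen only for illustration. It assumes a simple k-means style loop in which the distance function is a parameter, so the same algorithm can run with different similarity methods.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences between two numeric points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(points, centroids, distance, max_iter=20):
    """Illustrative k-means loop with a pluggable distance function.

    The distance function only drives the assignment step; centroids
    are always recomputed as the arithmetic mean of their members.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda i: distance(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep the previous centroid if no point was assigned to it.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # Converged: centroids no longer move.
        centroids = new_centroids
    return labels, centroids


# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
initial = [points[0], points[3]]

labels_euclidean, centroids_euclidean = kmeans(points, initial, euclidean)
labels_manhattan, centroids_manhattan = kmeans(points, initial, manhattan)
```

On well-separated data like this sample, both distance functions produce the same grouping. On less clearly separated data, the choice of similarity method can change which cluster a borderline item falls into.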

tMahoutClustering uses clustering algorithms from Mahout libraries. All processes are run in a given distributed file system.

Note:

Currently, Talend Studio supports Mahout 0.9.

For more technologies supported by Talend, see Talend components.