Hortonworks - Getting Started - 6.1

author: Louis Frolio
EnrichVersion: 6.4, 6.3, 6.2, 6.1
EnrichProdName: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Open Studio for Big Data, Talend Real-Time Big Data Platform
task: Design and Development > Designing Jobs > Hadoop distributions > Hortonworks
EnrichPlatform: Talend Studio

This article demonstrates how to get started with Hortonworks Data Platform (HDP) 2.4.

Prerequisites

  • You have installed and configured a Hortonworks Data Platform (HDP) 2.4 cluster.

    Alternatively, you can use the Hortonworks Sandbox, a downloadable virtual machine (VM).

  • You have installed Talend Studio.
  • You have the dataset used in this article (pearsonData.csv). It is called Pearson’s Height Data and is named for its creator, Karl Pearson, who founded the discipline of mathematical statistics in the early 1900s.

    You can download the Pearson dataset here. You can also use your own data, but some steps in this article will then need to be adjusted to match it.