Installing Talend Data Preparation manually

This procedure describes how to install Talend Data Preparation manually on your machine.

Before you begin

Talend Administration Center is installed and running.
Talend Identity and Access Management is installed and running.
A Talend Data Preparation user exists in Talend Administration Center. For more information, see the Talend Administration Center User Guide.
There are no other instances of MongoDB installed on your machine.
To use Talend Data Preparation with Big Data, use one of the supported Hadoop distributions. For more information, see Supported Hadoop distribution versions for Talend Data Preparation with Big Data.
Your machine meets the hardware and software requirements for Talend Data Preparation. For more information, see On-premises installation prerequisites.

Procedure

Download a MongoDB instance from https://www.mongodb.com/download-center and install it.
For more information on the supported MongoDB databases, see Compatible databases.

For more information on how to install it, see the MongoDB documentation.

If you want to secure connections with MongoDB using SSL, you must manually install MongoDB Enterprise Server on your machine. For more information, see https://docs.mongodb.com/v4.0/security/.
Unzip the Talend-DataPreparation-Server-VA.B.C.zip file where you want Talend Data Preparation to be installed.
Unzip the <Data_Preparation_Path>/services/components-api-service-rest-all-components-VA.B.C.zip file where you want Components Catalog to be installed.
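The two unzip steps above can be sketched as a small shell script. This is only an illustration: the install directories, the variable names and the extract_zip helper are hypothetical, A.B.C is left as the version placeholder, and the script assumes that the services folder ends up directly under the directory where the server archive is extracted, which may not match the layout of your archive.

```shell
#!/bin/sh
# Sketch only. Every path below is a hypothetical example; A.B.C stands for
# the version in your archive names and is left as a placeholder on purpose.
SERVER_ZIP="${SERVER_ZIP:-./Talend-DataPreparation-Server-VA.B.C.zip}"
DATA_PREPARATION_PATH="${DATA_PREPARATION_PATH:-/opt/talend/dataprep}"
COMPONENTS_ZIP="$DATA_PREPARATION_PATH/services/components-api-service-rest-all-components-VA.B.C.zip"
COMPONENTS_DIR="${COMPONENTS_DIR:-/opt/talend/components-catalog}"

extract_zip() {
    # $1 = archive to unzip, $2 = destination directory
    if [ ! -f "$1" ]; then
        echo "archive not found: $1" >&2
        return 1
    fi
    mkdir -p "$2" && unzip -q "$1" -d "$2"
}

# Nothing is created when an archive is missing; a message is printed instead.
extract_zip "$SERVER_ZIP" "$DATA_PREPARATION_PATH" || echo "server archive was not extracted" >&2
extract_zip "$COMPONENTS_ZIP" "$COMPONENTS_DIR" || echo "Components Catalog archive was not extracted" >&2
```

Override SERVER_ZIP, DATA_PREPARATION_PATH and COMPONENTS_DIR in the environment to match your own locations before running the script.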
To use Talend Data Preparation in a Big Data context, you need to install two additional tools, Streams Runner and Spark Job Server.
Note that Streams Runner and Spark Job Server must be installed on a Linux machine.
1. Unpack the <Data_Preparation_Path>/services/data-streams-streamsrunner-svc-A.B.C.tgz file where you want Streams Runner to be installed.
2. Unpack the <Data_Preparation_Path>/services/spark-jobserver-A.B.C.tar.gz file where you want Spark Job Server to be installed. This file contains Spark Job Server plus all the required dependencies.
  Additionally, you must have already installed curl, a command-line tool and library for transferring data with URLs. You can download it from https://curl.haxx.se/ if needed.
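For the Big Data case, the unpack steps and the curl prerequisite might look like the following on a Linux machine. Treat it as a sketch under assumptions: the target directories, the variable names and the extract_tarball helper are hypothetical, and A.B.C remains the version placeholder from the archive names.

```shell
#!/bin/sh
# Sketch only. Paths are hypothetical; A.B.C is the version placeholder.
DATA_PREPARATION_PATH="${DATA_PREPARATION_PATH:-/opt/talend/dataprep}"
STREAMS_TGZ="$DATA_PREPARATION_PATH/services/data-streams-streamsrunner-svc-A.B.C.tgz"
JOBSERVER_TGZ="$DATA_PREPARATION_PATH/services/spark-jobserver-A.B.C.tar.gz"
STREAMS_RUNNER_DIR="${STREAMS_RUNNER_DIR:-/opt/talend/streams-runner}"
JOBSERVER_DIR="${JOBSERVER_DIR:-/opt/talend/spark-jobserver}"

extract_tarball() {
    # $1 = gzip-compressed tar archive, $2 = destination directory
    if [ ! -f "$1" ]; then
        echo "archive not found: $1" >&2
        return 1
    fi
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Check the curl prerequisite before unpacking anything.
if command -v curl >/dev/null 2>&1; then
    echo "curl is available"
else
    echo "curl is not installed; install it before continuing" >&2
fi

extract_tarball "$STREAMS_TGZ" "$STREAMS_RUNNER_DIR" || echo "Streams Runner was not unpacked" >&2
extract_tarball "$JOBSERVER_TGZ" "$JOBSERVER_DIR" || echo "Spark Job Server was not unpacked" >&2
```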
Add mongo to the PATH environment variable.
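On Linux or macOS, one possible way to add the MongoDB binaries to PATH for the current shell session is shown below. The MONGO_BIN location is a hypothetical example; point it at the bin directory of your own MongoDB installation.

```shell
#!/bin/sh
# Hypothetical location of the MongoDB binaries; adjust to your installation.
MONGO_BIN="${MONGO_BIN:-/usr/local/mongodb/bin}"

# Append the directory only if PATH does not already contain it.
case ":$PATH:" in
    *":$MONGO_BIN:"*) ;;
    *) PATH="$PATH:$MONGO_BIN"; export PATH ;;
esac
```

This change lasts only for the current session. To make it permanent, the same lines can be added to your shell startup file, for example ~/.profile.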
Create the dataprep database in MongoDB by running the following command in the mongo shell: use dataprep. MongoDB actually creates the database when data is first written to it.
Create the following user for the dataprep database in MongoDB:
- Username: dataprep-user
- Password: duser
To do this, you can use the following command:
```
db.createUser( { user: "dataprep-user", pwd: "duser", roles: [{ role: "readWrite", db: "dataprep"}]})
```
Alternatively, you can create the user and password automatically by running the <Data_Preparation_Path>/create_mongo_user.sh script.
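If you prefer to run the database and user creation from a terminal instead of an interactive session, a sketch along these lines may work. It assumes the legacy mongo shell is on PATH and that a MongoDB server is listening on the default local address without authentication. It is not the content of create_mongo_user.sh; the CREATE_USER_JS variable is a hypothetical name, and getSiblingDB is used because the use helper is only available in an interactive shell.

```shell
#!/bin/sh
# Same user definition as above, expressed for non-interactive use:
# getSiblingDB("dataprep") plays the role of "use dataprep" in a script.
CREATE_USER_JS='db.getSiblingDB("dataprep").createUser({ user: "dataprep-user", pwd: "duser", roles: [{ role: "readWrite", db: "dataprep" }] })'

if command -v mongo >/dev/null 2>&1; then
    # Assumes a local server on the default port with no authentication.
    mongo --quiet --eval "$CREATE_USER_JS" || echo "user creation failed" >&2
else
    echo "mongo shell not found on PATH; nothing was created" >&2
fi
```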
