Adding a dataset from a database - Cloud

Talend Cloud Data Preparation User Guide

Author: Talend Documentation Team

Talend Cloud Data Preparation can connect to various databases and use them as a source to create new datasets.

By default, Talend Cloud Data Preparation offers connectivity to the following databases:

  • MariaDB
  • MySQL
  • Derby
  • PostgreSQL
  • SQL Server
  • Azure SQL
  • Snowflake
  • Redshift

In this example, you want to prepare customer data that is stored in a MySQL database. You will enter your database connection information directly in the Talend Cloud Data Preparation interface and create a new dataset from this data.

Procedure

  1. In the Datasets view of the Talend Cloud Data Preparation homepage, click the white arrow next to the Add Dataset button.
  2. Select Database.

    The Add database dataset form opens.

  3. In the Dataset name field, enter the name you want to give your dataset.
  4. In the Database type drop-down list, select the type of database you want to connect to (MySQL in this example).
  5. In the JDBC URL field, provide a URL to access your MySQL database.

    The form provides a URL template where you can adapt the values to match your own connection details:

    • Replace localhost with the IP address or hostname of your database server.
    • Replace 3306 with the port that you have set for MySQL. 3306 is the default port for MySQL.
    • Replace db with the name of the database you want to connect to.
  6. In the Username and Password fields, enter your MySQL connection information.
  7. Click Test connection.

    If the connection is successful, the second part of the form is displayed, where you can enter a query for your database. If not, an error message is displayed, detailing why the connection failed.

    Make sure that your MySQL server authorizes connections from Talend Cloud Data Preparation.

  8. In the Query field, enter the query for the information that you want to retrieve from the table stored in your database.

    In this example, a query such as SELECT * FROM customers retrieves all the information from the table called customers and outputs it as a dataset.

  9. Click the Add dataset button at the end of the form.
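
The URL adaptation described in step 5 can be sketched in a few lines of Python. This is only an illustration: the template jdbc:mysql://localhost:3306/db is assumed from the placeholder values named in that step, the function name build_mysql_jdbc_url is hypothetical, and the host 192.0.2.10 and database crm are made-up example values.

```python
def build_mysql_jdbc_url(host: str, port: int = 3306, database: str = "db") -> str:
    """Fill in the assumed template jdbc:mysql://localhost:3306/db.

    host:     IP address or hostname of the MySQL server
    port:     TCP port MySQL listens on (3306 is the MySQL default)
    database: name of the database to connect to
    """
    if not host:
        raise ValueError("host must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    if not database:
        raise ValueError("database must not be empty")
    return f"jdbc:mysql://{host}:{port}/{database}"


# Hypothetical connection details, for illustration only.
print(build_mysql_jdbc_url("192.0.2.10", 3306, "crm"))
# prints: jdbc:mysql://192.0.2.10:3306/crm
```

The resulting string is what you would paste into the JDBC URL field. The username and password are entered in their own fields rather than in the URL.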

Results

The data extracted from the customers table in your MySQL database opens directly in the grid, and you can start working on your preparation as usual.

The data remains stored in the MySQL database. Talend Cloud Data Preparation only retrieves a sample on demand.

The dataset is added to the list in the Datasets view of the homepage.
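
To illustrate the idea that only a sample is retrieved while the full table stays in the database, here is a minimal Python sketch. It makes several assumptions: an in-memory SQLite database stands in for the MySQL server, the customers table layout and its 1000 rows are invented, and SAMPLE_SIZE is a hypothetical value. It does not describe how Talend Cloud Data Preparation actually samples data.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)",
    [(f"customer-{i}",) for i in range(1, 1001)],
)

SAMPLE_SIZE = 100  # hypothetical; the real sample size is not stated here

# Run the same kind of query as in step 8, but read only the first rows.
cursor = conn.execute("SELECT * FROM customers")
sample = cursor.fetchmany(SAMPLE_SIZE)

# The source table is untouched and still holds every row.
total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(len(sample), total)
# prints: 100 1000
```

Reading a bounded number of rows from the cursor keeps the working set small regardless of how large the source table is.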