Talend Data Preparation is able to connect to various databases and use them as source to create
a new dataset.
By default, Talend Data Preparation offers connectivity to the following databases:
- AthenaDB
- Azure SQL
- Derby
- MariaDB
- MySQL
- PostgreSQL
- Redshift
- Snowflake
- SQL Server
The list of drivers used for these preconfigured connections can be found in the
<TDP_installation_path>/dataprep/services/tcomp/config/jdbc_config.json
file.
In this example, you want to prepare some customers data that is stored on a MySQL
database. You will enter your database connection information, directly in
the Talend Data Preparation interface and create a new dataset from this data.
Procedure
-
In the Datasets view of the Talend Data Preparation homepage, click the white arrow next to the
Add Dataset button.
-
Select Database.
The Add database
dataset form opens.
-
In the Dataset name field, enter the name you want to
give your dataset.
-
In the Database type drop-down list, select the type of
database you want to connect to, MySQL in this
example.
This list can be manually enriched. For more information,
see "Adding a new database type".
-
In the JDBC URL field, provide a URL to access your
MySQL database.
The form provides a URL template where you can adapt the
values to match your own connection details:
- Replace
localhost
with
your IP address.
- Replace
3306
with the
port that you have set for MySQL. 3306
is the default port for MySQL.
- Replace
db
with the
name of the database you want to connect to.
-
In the Username and Password
fields, enter your MySQL connection information.
-
Click Test connection.
If the connection is successful, the second part of the form is displayed, where
you can enter a query for your database. If not, an error
message is displayed, detailing why the connection
failed.
Make sure that MySQL authorizes connection
from Talend Data Preparation.
-
In the Query field, enter the query for the information
that you want to retrieve from the table stored in your database.
In this case, all the information from the table called
customers
will be retrieved and output as a dataset.
-
Click the Add dataset button at the end of the
form.
Results
The data extracted from the
customers
table in
your MySQL database directly opens in the grid and you can start working on
your preparation the same way you usually do.
The data is still stored in
the MySQL database, Talend Data Preparation only retrieves a sample on-demand.
The dataset is added to
the list in the Datasets view of the
homepage.