List of configuration parameters for Talend Data Preparation - 7.3

Talend Data Preparation User Guide

Version: 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Data Preparation
Content: Data Quality and Preparation > Cleansing data
Last publication date: 2023-11-28

All parameters in the application.properties file are set by default during the installation of Talend Data Preparation by Talend Installer. However, you can customize them according to your installation environment.

For further information about installing and configuring Talend Data Preparation, see the Talend installation guides.

Parameter Description
dataprep.locale Setting for the application interface language.
public.ip=<local machine ip>

server.port=9999

async-runtime.contextPath=/api

IP of the server hosting Talend Data Preparation and server port
server.compression.enabled=true

server.compression.mime-types=text/plain,text/html,text/css,application/json,application/x-javascript,text/xml,application/xml,application/xml+rss,text/javascript,application/javascript,text/x-js

Response compression parameters
iam.ip=<local machine ip>

iam.uri=http://${iam.ip}:9080

iam.api.uri=${iam.uri}

IP of the server hosting Talend Identity and Access Management, used for SSO, and server port
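Several values in this file refer to other parameters through ${...} placeholders, for example iam.api.uri=${iam.uri}. The sketch below only illustrates how such nested references expand. It is a simplified stand-in for Spring-style placeholder resolution, not the code the product runs, and the IP address is a made-up documentation address.

```python
import re

# Matches ${name} and ${name:default}; a simplified form of Spring-style placeholders.
PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def resolve(props, key, _seen=()):
    """Expand every ${...} reference in props[key], following nested references."""
    if key in _seen:
        raise ValueError(f"circular placeholder reference: {key}")

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return resolve(props, name, _seen + (key,))
        if default is not None:
            return default
        raise KeyError(f"unresolved placeholder: {name}")

    return PLACEHOLDER.sub(substitute, props[key])


# Parameter names come from the table; 192.0.2.10 is a hypothetical address.
props = {
    "iam.ip": "192.0.2.10",
    "iam.uri": "http://${iam.ip}:9080",
    "iam.api.uri": "${iam.uri}",
}

print(resolve(props, "iam.api.uri"))
```

Changing iam.ip alone is therefore enough to update every value derived from it.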
spring.mvc.async.request-timeout=600000 Timeout for asynchronous executions, in milliseconds. Do not change this value unless instructed to do so by Talend.
dataprep.event.listener=spring Event propagation parameter. Can be Spring or Kafka.
live.dataset.location=tac

live.dataset.url=http://<local machine ip>:8080/org.talend.administrator/

Parameters related to the Live dataset feature. URL to the Talend Administration Center instance used to list execution tasks as dataset sources.
live.dataset.task-prefix=dataprep_ Prefix used to list Talend Administration Center tasks in the Talend Data Preparation interface and create Live datasets. Only the tasks with this prefix will be listed when importing data with the Talend Job option.
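As an illustration of the prefix rule, the following sketch filters a list of task names the way the description suggests. The task names are hypothetical, and treating the comparison as case-sensitive is an assumption, not documented behavior.

```python
TASK_PREFIX = "dataprep_"  # value of live.dataset.task-prefix shown in the table


def listable_tasks(task_names, prefix=TASK_PREFIX):
    """Keep only the task names that start with the configured prefix."""
    # Assumption: the match is case-sensitive.
    return [name for name in task_names if name.startswith(prefix)]


# Hypothetical Talend Administration Center task names, for illustration only.
tasks = ["dataprep_customers", "nightly_export", "dataprep_orders", "Dataprep_test"]

print(listable_tasks(tasks))
```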
mongodb.host=<local machine ip>

mongodb.port=27017

mongodb.database=dataprep

mongodb.user=dataprep-user

mongodb.password=<randomly generated password>

multi-tenancy.mongodb.active=true

MongoDB settings
mongodb.uri= For more complex use cases, the mongodb.* settings can be overridden by specifying a URI directly.
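A value for mongodb.uri would typically follow the standard MongoDB connection string layout. The helper below is a minimal sketch that assembles such a string from the individual mongodb.* values. The host and password are made up, and which URI options your installation accepts is not covered here.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(host, port, database, user, password):
    """Assemble a standard MongoDB connection string.

    The user name and password are percent-encoded so that characters
    such as '@' or '/' do not break the URI.
    """
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical host and password; the other values are the defaults from the table.
uri = build_mongodb_uri("192.0.2.20", 27017, "dataprep", "dataprep-user", "p@ss/word")
print(uri)
```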
mongodb.ssl=true

mongodb.ssl.trust-store=/path/to/trust-store.jks

mongodb.ssl.trust-store-password=trust-store-password

Uncomment these parameters to set up a secured connection with MongoDB
tls.key-store=/path/to/key-store.jks

tls.key-store-password=key-store_password

tls.trust-store=/path/to/trust-store.jks

tls.trust-store-password=trust-store_password

tls.verify-hostname=false

Uncomment these parameters to set up a secured HTTPS connection for Talend Data Preparation
security.provider=oauth2

security.token.secret=encrypted password

Authentication parameters
spring.profiles.active=server-standalone

spring.mvc.favicon.enabled=false

spring.http.multipart.maxFileSize=200000000

spring.http.multipart.maxRequestSize=200000000

Spring parameters. Do not change these values unless instructed to do so by Talend.
service.documentation.name=Talend Data Preparation - API

service.documentation.description=This service exposes high level services that may involve services orchestration.

service.paths=api

springfox.documentation.swagger.v2.host=${public.ip}:${server.port}${gateway-api.service.path}
Set these parameters to enable access to Swagger
dataset.records.limit=10000

dataset.local.file.size.limit=2000000000

dataset.imports=local,job,tcomp-JDBCDatastore,tcomp-SimpleFileIoDatastore,tcomp-SalesforceDatastore,tcomp-S3Datastore

dataset.list.limit=10

Size limit and display parameters for your datasets
dataset.service.url=http://${public.ip}:${server.port}

transformation.service.url=http://${public.ip}:${server.port}

preparation.service.url=http://${public.ip}:${server.port}

fullrun.service.url=http://${public.ip}:${server.port}

async_store.service.url=http://${public.ip}:${server.port}

gateway.service.url=http://${public.ip}:${server.port}

Addresses of the dataset, transformation, preparation, full run, async store, and gateway services
dataset.metadata.store=mongodb

preparation.store=mongodb

user.data.store=mongodb

folder.store=mongodb

upgrade.store=mongodb

File storage service configuration parameters. Do not change these values unless instructed to do so by Talend.
content-service.store=local

content-service.store.local.path=data/

content-service.journalized=true

Location for cache and content storage
preparation.store.remove.hours=24 Preparation service configuration. Do not change this value unless instructed to do so by Talend.
lock.preparation.store=mongodb

lock.preparation.delay=600

Lock duration parameter in seconds, when working on shared preparations
luceneIndexStrategy=singleton Lucene index configuration. Do not change this value unless instructed to do so by Talend.
execution.store=mongodb

async.operation.concurrent.run=5

Parameters for asynchronous full run and sampling operations, namely the storage and the number of allowed concurrent runs. Do not change the mongodb value unless instructed to do so by Talend. If more full run operations are requested than the value of async.operation.concurrent.run allows to run in parallel, the extra operations are queued and resume when a slot becomes available. You can increase this value, depending on the capacity of your machine.
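To make the queuing behavior concrete, here is a small simulation of a fixed number of slots. It assumes that all runs are submitted at the same time and served in submission order, which is an assumption made for the sketch and not a description of the actual scheduler.

```python
import heapq


def schedule(durations, slots=5):
    """Return the start time of each run when at most `slots` runs execute at once.

    All runs are assumed to be submitted at time 0 and served first come, first served.
    """
    finish_times = []  # min-heap holding the finish time of each occupied slot
    starts = []
    for duration in durations:
        if len(finish_times) < slots:
            start = 0  # a slot is still free
        else:
            start = heapq.heappop(finish_times)  # wait for the earliest slot to free up
        starts.append(start)
        heapq.heappush(finish_times, start + duration)
    return starts


# Seven hypothetical full runs of 10 time units each, with the default of 5 slots.
print(schedule([10] * 7, slots=5))
```

With the default of 5, the sixth and seventh runs wait until one of the first five finishes. Raising the value lets them start immediately, at the cost of more load on the machine.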
tcomp.server.url=http://<local machine ip>:8989/tcomp URL of the server hosting the Components Catalog, used to configure self service connectors
tcomp-SimpleFileIoDatastore.kerberosPrincipal.default=${streams.kerberos.principal}

tcomp-SimpleFileIoDatastore.kerberosKeytab.default=${streams.kerberos.keytab_path}

tcomp-SimpleFileIoDataset.path.default=${streams.hdfs.server.url}

Components Catalog configuration properties. Allow you to set your Kerberos configuration automatically when importing datasets from HDFS.
tcomp-SimpleFileIoDatastore.test_connection.visible=false Parameter to remove the test connection step from the Talend component form. Do not change this parameter unless instructed to do so by Talend.
async.operation.watcher.ttl=3600000 Maximum execution time for full runs, in milliseconds
receivers.timeout=3600000 Maximum waiting time for Live datasets input
dataquality.indexes.file.location=data/data-quality/org.talend.dataquality.semantic Data quality indexes storage location. If you change this value, Talend Data Preparation will automatically recreate the indexes at startup, but only for the default ones. In order to retrieve your custom semantic types, you need to copy the content of your old directory, and paste it to the new location.
dataquality.semantic.list.enable=true

dataquality.server.url=http://<local machine ip>:8187/

Parameter to enable the editing of semantic types in the Talend Data Preparation interface, and URL of the server hosting Talend Dictionary Service
tsd.consumer.enabled=true

tsd.consumer.semantic-topic-content=raw

dataquality.event.store=mongodb

spring.cloud.stream.kafka.binder.brokers=tal-rd44.talend.lan

Data quality updates parameters
streams.enable=false

streams.flow.runner.url=http://<local machine ip>:<Big data preparation port>/

streams.kerberos.principal=<principal>

streams.kerberos.keytab_path=<keytab path>

streams.hdfs.server.url=hdfs://<host>:<port>/<filepath>

Streams Runner configuration parameters

Enable these parameters to configure Talend Data Preparation with Big Data

security.basic.enabled=false

security.oidc.client.expectedIssuer=http://tal-rd44.talend.lan:9080/oidc

iam.license.url=http://${iam.ip}:9080/oidc/api

security.oidc.client.keyUri=${iam.uri}/oidc/jwk/keys

security.oauth2.client.clientId=64xIVPxviKWSog

security.oauth2.client.clientSecret=9C0zCjp8yS-eZBqEi-KhBQ

security.oidc.client.claimIssueAtTolerance=120

# security.oauth2.resource.serviceId=${PREFIX:}resource

security.oauth2.resource.tokenInfoUri=${iam.uri}/oidc/oauth2/introspect

security.oauth2.resource.uri=/v2/api-docs,/api/**,/folders/**,/datasets/**,/dataset/**,/preparations/**,/transform/**,/version/**,/acl/**,/apply/**,/export,/export/**,/aggregate,/sampling/**,/receivers/**,/error,/docs,/datastores/**,/preparation/**,/actions/**,/suggest/**,/dictionary/**

security.oauth2.resource.filter-order=3

security.scim.enabled=true

security.oauth2.client.access-token-uri=${iam.uri}/oidc/oauth2/token

security.oauth2.client.scope=openid refreshToken

security.oauth2.client.user-authorization-uri=${iam.uri}/oidc/idp/authorize

security.oauth2.sso.login-use-forward=false

server.session.cookie.name=TDPSESSION

spring.session.store-type=hash_map

security.sessions=stateless

security.user.password=none

Single Sign-On security configuration parameters
security.oidc.client.endSessionEndpoint=${iam.uri}/oidc/idp/logout

security.oidc.client.logoutSuccessUrl=http://${public.ip}:${server.port}

security.oauth2.logout.uri=/signOut

security.oauth2.sso.login-path=/signIn

iam.scim.url=http://${iam.ip}:9080/scim/

security.oauth2.resource.tokenInfoUriCache.enabled=true

tenant.account.cache.enabled=true

Single Sign-On properties for Talend Data Preparation API and Gateway
gateway-api.service.url=http://${public.ip}:${server.port}

gateway-api.service.path=/gateway

zuul.servletPath=/gateway/upload

zuul.routes.dq.path=/gateway/dq/semanticservice/**

zuul.routes.dq.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.dq.url=${dataquality.server.url}/

proxy.auth.routes.dq=oauth2

zuul.routes.api.path=/gateway/api/**

zuul.routes.api.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.api.url=http://${public.ip}:${server.port}/api

proxy.auth.routes.api=oauth2

zuul.sensitiveHeaders=Cookie,Set-Cookie,Expires,X-Content-Type-Options,X-Xss-Protection,Cookie,X-Frame-Options,Cache-control,Pragma

zuul.host.socket-timeout-millis=300000

zuul.host.connect-timeout-millis=5000

Single Sign-On configuration parameters. Do not change these values unless instructed to do so by Talend.
logging.file=data/logs/app.log Path of the log file
logging.pattern.level=%5p [user %X{user}] Level output pattern for the log file
logging.pattern.file=%d{yyyy-MM-dd HH:mm:ss.SSS} %5p --- [%t] %-40.40logger{39} : %m%n%wEx Uncomment this parameter to enable log pattern configuration
logging.level=WARN

logging.level.org.talend.dataprep=INFO

logging.level.org.talend.dataprep.api=INFO

logging.level.org.talend.dataprep.dataset=INFO

logging.level.org.talend.dataprep.preparation=INFO

logging.level.org.talend.dataprep.transformation=INFO

logging.level.org.talend.dataprep.fullrun=INFO

logging.level.org.talend.dataprep.api.dataquality=INFO

logging.level.org.talend.dataprep.configuration=INFO

logging.level.org.talend.dataquality.semantic=INFO

Talend Data Preparation logger parameters
logging.pattern.console=%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(%5p) %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n%wEx Uncomment this parameter to enable console logging pattern configuration
spring.output.ansi.enabled=always Uncomment this parameter to configure ANSI coloring in the console output
logging.config=logback.xml Uncomment this parameter to configure the Talend Data Preparation logging with a custom logback file.

Enter the path to your logback file

audit.log.enabled=true

talend.logging.audit.config=config/audit.properties

Audit logs parameters
default.text.enclosure="

default.text.escape="

default.text.encoding=UTF-8

Configurable values for the default enclosure and escape characters for CSV exports, as well as the default text encoding. The default separator can be a semicolon ";", tab "\t", space " ", comma "," or pipe "|"
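When the enclosure and escape characters are both a double quote, a quote inside a value is written as two quotes. The sketch below reproduces that convention with the Python csv module to show what such an export looks like. It is only an illustration of the format, not the export code of Talend Data Preparation, and the sample rows are made up.

```python
import csv
import io


def export_csv(rows, separator=";", enclosure='"', encoding="UTF-8"):
    """Serialize rows with the given separator and enclosure.

    doublequote=True escapes an enclosure character inside a value by
    doubling it, which corresponds to an escape character equal to the enclosure.
    """
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=separator,
        quotechar=enclosure,
        doublequote=True,
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue().encode(encoding)


# Hypothetical rows, for illustration only.
data = export_csv([["id", "comment"], ["1", 'say "hi"']])
print(data.decode("UTF-8"))
```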
default.import.text.enclosure="

default.import.text.escape=

Configurable values for the default enclosure and escape characters for CSV imports
app.products[0].id=TDS

app.products[0].name=Data Stewardship

app.products[0].url=<place_your_tds_url_here>

When Talend Data Preparation and Talend Data Stewardship are both installed, you have the possibility to switch between the two applications. Configure the URL to your Talend Data Stewardship instance so that you can reach it.