List of configuration parameters for Talend Data Preparation - 8.0

Talend Data Preparation User Guide


All parameters in the application.properties file are set by default during the installation of Talend Data Preparation by Talend Installer. However, you can customize them according to your installation environment.

For further information about installing and configuring Talend Data Preparation, see the Talend installation guides.

Parameter Description
dataprep.locale Setting for the application interface language.
public.ip=<place_your_public_ip_here>

server.port=9999

management.server.port=9999

async-runtime.contextPath=/api

IP of the server hosting Talend Data Preparation and server port.
server.compression.enabled=true

server.compression.mime-types=text/plain,text/html,text/css,application/json,application/x-javascript,text/xml,application/xml,application/xml+rss,text/javascript,application/javascript,text/x-js

Response compression parameters.
iam.ip=<place_your_iam_ip_here>

iam.uri=http://${iam.ip}:9080

iam.api.uri=${iam.uri}

IP of the server hosting Talend Identity and Access Management, used for SSO, and server port.
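As an illustration of how the ${...} placeholders in these entries resolve, here is a minimal sketch. It assumes a hypothetical Talend Identity and Access Management host at 192.0.2.20, which is a documentation-only address and not a value from this guide:

```properties
# Hypothetical address; replace it with the IP of your own IAM server
iam.ip=192.0.2.20
# With the value above, this resolves to http://192.0.2.20:9080
iam.uri=http://${iam.ip}:9080
# Resolves to the same value as iam.uri
iam.api.uri=${iam.uri}
```

Because the other entries reference ${iam.ip}, only the first line should need to change in this scenario.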
spring.mvc.async.request-timeout=600000 Timeout setting for asynchronous executions. Do not change this value unless instructed to do so by Talend.
dataprep.event.listener=spring

spring.service.name=TDP

Event propagation parameter. Can be spring or Kafka.
live.dataset.location=tac

live.dataset.url=<place_live_dataset_provider_url_here>

live.dataset.task-prefix=dataprep_

Parameters related to the live dataset feature.
Warning: Starting with version 8.0, the live dataset feature is no longer available.
spring.data.mongodb.host=localhost

spring.data.mongodb.port=27017

spring.data.mongodb.database=dataprep

spring.data.mongodb.username=dataprep-user

spring.data.mongodb.password=<randomly_generated_password>

multi-tenancy.mongodb.active=false

MongoDB settings.
spring.data.mongodb.uri= For more complex use cases, mongo.* configurations can be overridden by specifying a URI directly.
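As a sketch of what such a URI could look like, the following uses the standard MongoDB connection string format and reuses the default host, port, database, and user name from the MongoDB settings listed earlier. The password is a placeholder, and whether you need additional URI options depends on your deployment:

```properties
# Standard MongoDB connection string: mongodb://<user>:<password>@<host>:<port>/<database>
# <your_password> is a placeholder, not a real value
spring.data.mongodb.uri=mongodb://dataprep-user:<your_password>@localhost:27017/dataprep
```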
mongo.index.tdp_dataSetMetadata.name=dataSetMetadata.name

mongo.index.tdp_dataSetMetadata.lifecycle-importing.creation-date=creation-date

mongo.index.tdp_acl.resource-id-type=id-type

mongo.index.tdp_acl.resource-type.owner-id.controls-type-who=type-who

mongo.index.tdp_folder.owner-id.path=id.path

mongo.index.tdp_folder.parent-id=parent-id

mongo.index.tdp_folderEntry.folder-id.content-type=content-type

mongo.index.tdp_folderEntry.content-id.content-type=content-type2

mongo.index.tdp_identifiable.class.content-id=content-id

mongo.index.tdp_fullrun.group=group

MongoDB index names. Specify a new index name if you want a shorter name than the default generated one.
mongodb.ssl=true

mongodb.ssl.trust-store=/path/to/trust-store.jks

mongodb.ssl.trust-store-password=trust-store-password

Uncomment these parameters to set up a secured connection with MongoDB.
tls.key-store=/path/to/key-store.jks

tls.key-store-password=key-store_password

tls.trust-store=/path/to/trust-store.jks

tls.trust-store-password=trust-store_password

tls.verify-hostname=false

Uncomment these parameters to set up a secured HTTPS connection for Talend Data Preparation.
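For readers unfamiliar with the properties file syntax, here is a minimal sketch of what uncommenting means. It assumes the entries are disabled with a leading # character, which is the standard comment marker in .properties files. The path is the placeholder shown above, not a real location:

```properties
# Commented out (inactive):
# tls.key-store=/path/to/key-store.jks

# Uncommented (active); replace the placeholder path with your own key store
tls.key-store=/path/to/key-store.jks
```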
security.provider=oauth2

security.token.secret=top-secret

Authentication parameters.
talend.security.token.value=<CHANGE_IT> Token to access actuator data.
spring.profiles.active=server-standalone

spring.mvc.favicon.enabled=false

spring.servlet.multipart.max-file-size=200000000

spring.servlet.multipart.max-request-size=200000000

Spring parameters. Do not change these values unless instructed to do so by Talend.
service.documentation=true

service.documentation.name=Talend Data Preparation - API

service.documentation.description=This service exposes high level services that may involve services orchestration.

service.paths=api

springfox.documentation.swagger.v2.host=${public.ip}:${server.port}${gateway-api.service.path}

springfox.resources.prefix.url=${gateway-api.service.path}

logging.level.io.swagger.models.parameters.AbstractSerializableParameter=error

Set these parameters to enable access to Swagger.
dataset.records.limit=10000

dataset.local.file.size.limit=2000000000

dataset.imports=local,job,tcomp-JDBCDatastore,tcomp-SimpleFileIoDatastore,tcomp-SalesforceDatastore,tcomp-S3Datastore,tcomp-AzureDlsGen2BlobDatastore

dataset.list.limit=10

Size limit and display parameters for your datasets.
api.service.url=http://${public.ip}:${server.port}

dataset.service.url=http://${public.ip}:${server.port}

tdc.dataset.url=http://${public.ip}:${server.port}

dataset-dispatcher.service.url=http://${public.ip}:${server.port}

transformation.service.url=http://${public.ip}:${server.port}

preparation.service.url=http://${public.ip}:${server.port}

fullrun.service.url=http://${public.ip}:${server.port}

gateway.service.url=http://${public.ip}:${server.port}

tdc.sharing.url=http://${public.ip}:${server.port}

tdc.rating.url=http://${public.ip}:${server.port}

Address of the dataset service.
dataset.metadata.store=mongodb

preparation.store=mongodb

user.data.store=mongodb

folder.store=mongodb

upgrade.store=mongodb

File storage service configuration parameters. Do not change these values unless instructed to do so by Talend.
content-service.store=local

content-service.store.local.path=data/

content-service.journalized=true

Location for cache and content storage.
preparation.store.remove.hours=24 Preparation service configuration. Do not change this value unless instructed to do so by Talend.
lock.preparation.store=mongodb

lock.preparation.delay=600

Lock duration parameter in seconds, when working on shared preparations.
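Since the lock duration is expressed in seconds, the default of 600 corresponds to 10 minutes. A hypothetical longer lock could be sketched as follows; the value 1200 is an illustration only, not a recommendation from this guide:

```properties
# Hypothetical value: 1200 seconds = 20 minutes
lock.preparation.delay=1200
```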
luceneIndexStrategy=singleton Lucene index configuration. Do not change this value unless instructed to do so by Talend.
execution.store=mongodb

async.operation.concurrent.run=5

Parameters for asynchronous full run and sampling operations, namely storage and number of allowed concurrent runs. Do not change the mongodb value unless instructed to do so by Talend. If more full run operations are requested in parallel than the value of async.operation.concurrent.run allows, the additional operations are queued and resume when a slot becomes available. You can increase this value according to the capacity of your machine.
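As a sketch of raising the concurrency limit on a machine with spare capacity, the entry could be changed as shown below. The value 10 is a hypothetical example; a suitable number depends on your hardware:

```properties
# Hypothetical value: allow up to 10 operations in parallel; any further ones are queued
async.operation.concurrent.run=10
```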
tcomp.server.url=http://<place_tcomp_ip_here>:8989/tcomp URL of the server hosting the Components Catalog, used to configure self service connectors.
tcomp-SimpleFileIoDatastore.kerberosPrincipal.default=${streams.kerberos.principal}

tcomp-SimpleFileIoDatastore.kerberosKeytab.default=${streams.kerberos.keytab_path}

tcomp-SimpleFileIoDataset.path.default=${streams.hdfs.server.url}

Components Catalog configuration properties. They allow you to set your Kerberos configuration automatically when importing datasets from HDFS.
tcomp-SimpleFileIoDatastore.test_connection.visible=false Parameter to remove the test connection step from the Talend component form. Do not change this parameter unless instructed to do so by Talend.
async.operation.watcher.ttl=3600000 Maximum execution time for full runs, in milliseconds.
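Because this value is expressed in milliseconds, the default of 3600000 corresponds to one hour (60 × 60 × 1000). A hypothetical two-hour limit could be sketched as:

```properties
# Hypothetical value: 2 hours = 2 x 60 x 60 x 1000 = 7200000 milliseconds
async.operation.watcher.ttl=7200000
```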
receivers.timeout=3600000 Maximum waiting time for live datasets input.
Warning: Starting with version 8.0, the live dataset feature is no longer available.
dataquality.semantic.list.enable=false

dataquality.server.url=<place_data-quality_server_url_here>

Parameters to enable semantic type editing in the Talend Data Preparation interface, and URL of the server hosting Talend Dictionary Service.
tsd.enabled=false

tsd.consumer.semantic-topic-content=raw

tsd.maven.connector.s3Repository.bucket-url=<place_minio_bucket-url_here>

tsd.maven.connector.s3Repository.base-path=<place_minio_base-path_here>

tsd.maven.connector.s3Repository.username=<place_minio_username_here>

tsd.maven.connector.s3Repository.password=<place_minio_password_here>

tsd.maven.connector.s3Repository.s3.region=<place_minio_region>

tsd.maven.connector.s3Repository.s3.endpoint=<place_minio_server_url_here>

tsd.dictionary-provider-facade.producer-url=<place_data-quality_server_url_here>

tsd.dictionary-provider.index-folder=tsd-index

tsd.dictionary-provider-facade.producer-url=${semanticservice.url}

dataquality.event.store=mongodb

spring.cloud.stream.kafka.binder.brokers=<place_kafka_ip_here>

schema.kafka.topics.prefix=

Parameters that must match your MinIO or S3 repository configuration in order to use the default and custom semantic types.
tsd.maven.connector.temporaryFolder=${catalina.base}/data: Directory to download and extract Lucene indexes from MinIO.

tsd.dictionary-provider.index-folder=${catalina.base}/data: Directory to store the Lucene indexes.

Parameters to define the folders used to extract the semantic types.
streams.enable=false

streams.flow.runner.url=http://<local machine ip>:<Big data preparation port>

streams.kerberos.principal=<principal>

streams.kerberos.keytab_path=<keytab path>

streams.hdfs.server.url=hdfs://<host>:<port>/<filepath>

Streams Runner configuration parameters.

Enable these parameters to configure Talend Data Preparation with Big Data.

Warning: Starting with version 8.0, exporting the result of your preparation to HDFS is no longer possible.
security.basic.enabled=false

security.oidc.client.expectedIssuer=${iam.uri}/oidc

iam.license.url=${iam.uri}/oidc/api

security.oidc.client.keyUri=${iam.uri}/oidc/jwk/keys

security.oauth2.client.clientId=<security client id>

security.oauth2.client.clientSecret=<security client secret>

security.oidc.client.claimIssueAtTolerance=120

security.oauth2.resource.serviceId=${PREFIX:}resource

security.oauth2.resource.tokenInfoUri=${iam.uri}/oidc/oauth2/introspect

security.oauth2.resource.uri=/v2/api-docs,/api/**,/folders/**,/datasets/**,/dataset/**,/preparations/**,/transform/**,/version/**,/acl/**,/apply/**,/export,/export/**,/aggregate,/sampling/**,/receivers/**,/error,/datastores/**,/preparation/**,/actions/**,/suggest/**,/dictionary/**,/transformation/preparations/**,/transformation/v2/**,/sharing/**

security.oauth2.resource.filter-order=3

security.scim.enabled=true

security.oauth2.client.access-token-uri=${iam.uri}/oidc/oauth2/token

security.oauth2.client.scope=openid refreshToken

security.oauth2.client.user-authorization-uri=${iam.uri}/oidc/idp/authorize

security.oauth2.sso.login-use-forward=false

server.servlet.session.cookie.name=TDPSESSION

spring.session.store-type=NONE

security.sessions=stateless

security.user.password=none

Single Sign-On security configuration parameters.
security.oidc.client.endSessionEndpoint=${iam.uri}/oidc/idp/logout

security.oidc.client.logoutSuccessUrl=http://${public.ip}:${server.port}

security.oauth2.logout.uri=/signOut

security.oauth2.sso.login-path=/signIn

iam.scim.url=${iam.api.uri}/scim/

security.oauth2.resource.tokenInfoUriCache.enabled=true

tenant.account.cache.enabled=true

Single Sign-On properties for Talend Data Preparation API and Gateway.
gateway-api.service.url=http://${public.ip}:${server.port}

gateway-api.service.path=/gateway

zuul.servletPath=/gateway/upload

zuul.routes.dq.path=/gateway/dq/semanticservice/**

zuul.routes.dq.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.dq.url=${dataquality.server.url}/

proxy.auth.routes.dq=oauth2

zuul.routes.me.path=/gateway/api/v1/scim/me/**

zuul.routes.me.url=${iam.scim.url}/Me

proxy.auth.routes.me=oauth2

zuul.routes.pendo.path=/gateway/api/iam-server/pendo/**

zuul.routes.pendo.url=${iam.scim.url}/pendo

proxy.auth.routes.pendo=oauth2

zuul.routes.sharingset.path=/gateway/api/v1/sharingset/**

zuul.routes.sharingset.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.sharingset.url=http://${public.ip}:${server.port}/sharing/v1/sharingset

proxy.auth.sharingset.api=oauth2

zuul.routes.sharing.path=/gateway/api/v1/sharing/**

zuul.routes.sharing.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.sharing.url=http://${public.ip}:${server.port}/sharing/v1/sharing

proxy.auth.sharing.api=oauth2

zuul.routes.sharings.path=/gateway/api/v1/sharings/**

zuul.routes.sharings.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.sharings.url=http://${public.ip}:${server.port}/sharing/v1/sharings

proxy.auth.sharings.api=oauth2

zuul.routes.api.path=/gateway/api/**

zuul.routes.api.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.api.url=http://${public.ip}:${server.port}/api

proxy.auth.routes.api=oauth2

zuul.routes.swagger1.path=/gateway/v2/api-docs/**

zuul.routes.swagger1.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.swagger1.url=http://${public.ip}:${server.port}/v2/api-docs

proxy.auth.swagger1.api=oauth2

zuul.ignoredPatterns=/login,/logout,/signOut,/signIn

zuul.sensitiveHeaders=Cookie,Set-Cookie

zuul.host.socket-timeout-millis=300000

zuul.host.connect-timeout-millis=5000

Single Sign-On configuration parameters. Do not change these values unless instructed to do so by Talend.
logging.file=data/logs/app.log Path of the log file storage folder.
logging.pattern.level=%5p [user %X{userId}] Level output pattern for the log file.
logging.pattern.file=%d{yyyy-MM-dd HH:mm:ss.SSS} %5p --- [%t] %-40.40logger{39} : %m%n%wEx Uncomment this parameter to enable log pattern configuration.
logging.level.=WARN

logging.level.org.talend.dataprep=INFO

logging.level.org.talend.dataprep.api=INFO

logging.level.org.talend.dataprep.dataset=INFO

logging.level.org.talend.dataprep.preparation=INFO

logging.level.org.talend.dataprep.transformation=INFO

logging.level.org.talend.dataprep.fullrun=INFO

logging.level.org.talend.dataprep.api.dataquality=INFO

logging.level.org.talend.dataprep.configuration=INFO

logging.level.org.talend.dataquality.semantic=INFO

Talend Data Preparation logger parameters.
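As a sketch of adjusting a single logger, the following raises the verbosity of the dataset package only. DEBUG is one of the standard Spring Boot log levels; choosing this package for debugging is a hypothetical scenario, not a recommendation from this guide:

```properties
# Hypothetical change: more verbose logs for the dataset package only
logging.level.org.talend.dataprep.dataset=DEBUG
# Other loggers keep their configured level
logging.level.org.talend.dataprep.preparation=INFO
```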
logging.pattern.console=%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(%5p) %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n%wEx Uncomment this parameter to enable console logging pattern configuration.
spring.output.ansi.enabled=always Uncomment this parameter to configure ANSI coloring in the console output. The possible values are always, never, and detect.
logging.config=classpath:logback-spring.xml Uncomment this parameter to configure the Talend Data Preparation logging with a custom logback file.

Enter the path to your logback file.

audit.log.enabled=true

talend.logging.audit.config=config/audit.properties

Audit log parameters.
help.url=https://help.talend.com

help.version=content

help.facets.version=8.0

help.facets.language=en

help.search.url=https://www.talendforge.org/find/api/THC.php

help.fuzzy.url=${help.url}/search/all?filters=EnrichPlatform~%2522Talend+Data+Preparation%2522*EnrichVersion~%2522${help.facets.version}%2522&utm_medium=dpdesktop&utm_source=

help.exact.url=${help.url}/access/sources/${help.version}/topic?EnrichPlatform=Talend+Data+Preparation&EnrichVersion=${help.facets.version}&utm_medium=dpdesktop

support.url=https://www.talend.com/services/technical-support/

community.url=https://community.talend.com/t5/Data-Quality-and-Preparation/bd-p/prepare_govern

Parameters that allow access to the online documentation.

Do not change these values unless instructed to do so by Talend.

default.text.enclosure=\"

default.text.escape=\"

default.text.encoding=UTF-8

Configurable values for the default enclosure and escape characters for CSV exports, as well as the default text encoding. The default separator can be semicolon ";", tab "\t", space " ", comma "," or pipe "|".
default.import.text.enclosure=\"

default.import.text.escape=

Configurable values for the default enclosure and escape characters for CSV imports.
app.products[0].id=TDS

app.products[0].name=Data Stewardship

app.products[0].url=<place_your_tds_url_here>

When Talend Data Preparation and Talend Data Stewardship are both installed, you can switch between the two applications. Configure the URL of your Talend Data Stewardship instance so that you can reach it.
dataset.service.provider=legacy

management.health.redis.enabled=false

management.endpoint.prometheus.enabled=false

Technical parameters. Do not change these values unless instructed to do so by Talend.
maintenance.scheduled.cron = 0 0 3 * * *

maintenance.scheduled.fixed-delay = 3600000

maintenance.scheduled.initial-delay = 3600000

spring.batch.datasource=mongodb

spring.batch.job.enabled=false

server.servlet.session.timeout=60m

oidc.accessTokenLifetime=3600

features.lineage.enabled = false

features.inventory.events.enabled=false

dataprep.actions.exclude=detect_outliers

dataprep.actions.useDeprecated=true

features.items.sharing.key=tdp.sharing.impl

features.items.sharing.modes.legacy=legacy

features.items.sharing.defaultMode=legacy

features.items.sharing.provider=fixed

features.statistics.enabled=true

Maintenance parameters. Do not change these values unless instructed to do so by Talend.
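To help read the maintenance schedule, here is an annotated copy of the default cron entry. The annotation assumes the expression follows the six-field cron format used by Spring scheduling, which this guide does not state explicitly. It is shown for interpretation only, not as a suggested change:

```properties
# Assuming Spring's six-field cron format:
#   second  minute  hour  day-of-month  month  day-of-week
# "0 0 3 * * *" would then mean: every day at 03:00:00
maintenance.scheduled.cron = 0 0 3 * * *
```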
iam.uri=https://iam.us.cloud.talend.com

iam.api.uri=https://api.us.cloud.talend.com/v1

iamproxy.service.url=${iam.api.uri}/iam

server.portal.url=https://portal.us.cloud.talend.com

external.user.preferences.url=${server.portal.url}/user/aboutme

security.oauth2.resource.jwt.key-uri=${iam.uri}/oidc/jwk/keys

security.oidc.client.sessionManagementUri=${iam.uri}/oidc/session-management

security.oauth2.client.scope=openid refreshToken entitlements

Uncomment these parameters to configure hybrid mode for Talend Data Preparation after enabling it in the Talend Management Console interface.