List of configuration parameters for Talend Data Preparation - 2.1

Talend Data Preparation User Guide

author: Talend Documentation Team
EnrichVersion: 6.4, 2.1
EnrichProdName: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
task: Data Quality and Preparation > Cleansing data
EnrichPlatform: Talend Data Preparation

All parameters in the application.properties file are set by default during the installation of Talend Data Preparation by Talend Installer. However, you can customize them according to your installation environment.

For further information about installing and configuring Talend Data Preparation, see the Talend installation guides.

Parameter

Description

tac.url=http://<local machine ip>:8080/org.talend.administrator/

URL of your Talend Administration Center instance, used for user, license and rights management.

public.ip=<local machine ip>

server.port=9999

IP address of the server hosting Talend Data Preparation, and the port the server listens on

iam.ip=<local machine ip>

IP of the server hosting Talend Identity and Access Management, used for SSO

spring.mvc.async.request-timeout=300000

Timeout for asynchronous executions, in milliseconds. Do not change this value unless instructed to do so by Talend.

tac.task-prefix=dataprep_

Prefix used to list Talend Administration Center tasks in the Talend Data Preparation interface and to create Live datasets. Only tasks with this prefix are listed when importing data with the "from Talend Job" option.
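As an illustration of this prefix filter, the fragment below keeps the default value. The task names in the comments are hypothetical and only show which tasks would match.

```properties
# Default prefix, as set by Talend Installer
tac.task-prefix=dataprep_

# Hypothetical Talend Administration Center task names:
#   dataprep_customers  -> starts with the prefix, so it is listed
#   nightly_export      -> does not start with the prefix, so it is not listed
```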

tac.user-name=security@company.com

tac.password=<security@company.com>

Username and password for your Talend Administration Center administrator account. This user will be used to list tasks when creating Live datasets.

mongodb.host=<local machine ip>

mongodb.port=27017

mongodb.database=dataprep

mongodb.user=dataprep-user

mongodb.password=<randomly generated password>

multi-tenancy.mongodb.active=true

MongoDB settings

mongodb.ssl=true

mongodb.ssl.trust-store=/path/to/trust-store.jks

mongodb.ssl.trust-store-password=trust-store-password

Uncomment these parameters to set up a secured connection with MongoDB
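A minimal sketch of this block before and after uncommenting, assuming the lines are commented out with a leading # (the usual comment marker in a properties file). The trust store path and password are the placeholders from the listing, to be replaced with your own values.

```properties
# Before: secured connection to MongoDB not configured
#mongodb.ssl=true
#mongodb.ssl.trust-store=/path/to/trust-store.jks
#mongodb.ssl.trust-store-password=trust-store-password

# After: leading '#' removed, placeholders replaced with your own values
mongodb.ssl=true
mongodb.ssl.trust-store=/path/to/trust-store.jks
mongodb.ssl.trust-store-password=trust-store-password
```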

tls.key-store=/path/to/key-store.jks

tls.key-store-password=key-store_password

tls.trust-store=/path/to/trust-store.jks

tls.trust-store-password=trust-store_password

tls.verify-hostname=false

Uncomment these parameters to set up a secured HTTPS connection for Talend Data Preparation
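A sketch of the HTTPS block once uncommented, assuming a leading # marks the commented lines. The paths and passwords are the placeholders from the listing, not working values. Whether tls.verify-hostname should remain false depends on your certificates, so treat that line as something to check against the Talend installation guides.

```properties
# Key store and trust store: replace the placeholders with your own values
tls.key-store=/path/to/key-store.jks
tls.key-store-password=key-store_password
tls.trust-store=/path/to/trust-store.jks
tls.trust-store-password=trust-store_password

# Value from the listing; review it for your environment
tls.verify-hostname=false
```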

security.provider=oauth2

security.token.secret=encrypted password

security.token.renew-after=30

security.token.invalid-after=3600

Authentication parameters

spring.profiles.active=server-standalone

spring.mvc.favicon.enabled=false

Spring parameters. Do not change these values unless instructed to do so by Talend.

service.documentation=false

service.documentation.name=Talend Data Preparation - API

service.documentation.description=This service exposes high level services that may involve services orchestration.

service.paths=api

Set these parameters to enable access to Swagger
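A hedged example: the listing shows service.documentation=false, so enabling access to Swagger presumably means switching that value to true. That reading is an assumption. The other three values are unchanged from the listing.

```properties
# Assumption: true turns the API documentation on
service.documentation=true
service.documentation.name=Talend Data Preparation - API
service.documentation.description=This service exposes high level services that may involve services orchestration.
service.paths=api
```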

dataset.records.limit=10000

dataset.local.file.size.limit=2000000000

dataset.imports=local,job,tcomp-JDBCDatastore,tcomp-SimpleFileIoDatastore,tcomp-SalesforceDatastore,tcomp-S3Datastore

dataset.list.limit=10

Size limit and display parameters for your datasets

dataset.service.url=http://${public.ip}:${server.port}

transformation.service.url=http://${public.ip}:${server.port}

preparation.service.url=http://${public.ip}:${server.port}

fullrun.service.url=http://${public.ip}:${server.port}

Addresses of the dataset, transformation, preparation and full run services
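To illustrate how the ${...} placeholders in these URLs resolve, the sketch below assumes a hypothetical server address of 192.0.2.10 (an IP address reserved for documentation) together with the default port. The resolved value in the comment follows from plain substitution.

```properties
# Hypothetical address; 9999 is the default port from the listing
public.ip=192.0.2.10
server.port=9999

# Each service URL reuses both values
dataset.service.url=http://${public.ip}:${server.port}
# resolves to http://192.0.2.10:9999
```

Changing public.ip or server.port therefore updates all four service URLs at once.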

dataset.metadata.store=mongodb

preparation.store=mongodb

user.data.store=mongodb

folder.store=mongodb

upgrade.store=mongodb

File storage service configuration parameters. Do not change these values unless instructed to do so by Talend.

content-service.store=local

content-service.store.local.path=data/

Location for cache and content storage

preparation.store.remove.hours=24

Preparation service configuration. Do not change this value unless instructed to do so by Talend.

lock.preparation.store=mongodb

lock.preparation.delay=600

Lock storage and lock duration, in seconds, used when working on shared preparations

hazelcast.enabled=true

Enables or disables Hazelcast. Do not change this value unless instructed to do so by Talend.

luceneIndexStrategy=singleton

Lucene index configuration. Do not change this value unless instructed to do so by Talend.

execution.store=mongodb

async.operation.concurrent.run=5

Parameters for asynchronous full run and sampling operations: the storage used, and the number of concurrent runs allowed. Do not change the mongodb value unless instructed to do so by Talend. If more full run operations are requested than the value of async.operation.concurrent.run allows, the extra operations are queued and start as soon as a slot becomes available. You can increase this value according to the capacity of your machine.
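For example, on a machine with spare CPU and memory you might raise the limit. The value 8 below is an arbitrary illustration, not a recommendation from Talend.

```properties
# Storage for asynchronous executions: leave as is
execution.store=mongodb

# Up to 8 full runs in parallel (default is 5); a 9th waits in the queue
async.operation.concurrent.run=8
```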

tcomp.server.url=http://<local machine ip>:8989/tcomp

URL of the server hosting the Components Catalog, used to configure self service connectors

tcomp-JDBCDataset.sourceType.hide=true

tcomp-JDBCDatastore.password.hide=true

Components Catalog configuration properties. They let you hide specific fields in the database dataset import form.

tcomp-SimpleFileIoDatastore.kerberosPrincipal.default=${streams.kerberos.principal}

tcomp-SimpleFileIoDatastore.kerberosKeytab.default=${streams.kerberos.keytab_path}

tcomp-SimpleFileIoDataset.path.default=${streams.hdfs.server.url}

Components Catalog configuration properties. They let you set your Kerberos configuration automatically when importing datasets from HDFS.

tcomp-SimpleFileIoDatastore.test_connection.visible=false

Parameter to remove the test connection step from the Talend component form. Do not change this parameter unless instructed to do so by Talend.

async.operation.watcher.ttl=3600000

Maximum execution time for full runs, in milliseconds
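The default is easier to read once converted: 3600000 ms is 3600 s, or one hour. The halved value below is a hypothetical example and is left commented out.

```properties
# Default: 3600000 ms = 60 min
async.operation.watcher.ttl=3600000

# Hypothetical alternative: 1800000 ms = 30 min
#async.operation.watcher.ttl=1800000
```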

receivers.timeout=3600000

Maximum waiting time for Live datasets input

dataquality.indexes.file.location=data/data-quality/org.talend.dataquality.semantic

Storage location of the data quality indexes. If you change this value, Talend Data Preparation automatically recreates the indexes at startup, but only the default ones. To retrieve your custom semantic types, copy the content of the old directory to the new location.
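A sketch of relocating the indexes, assuming a hypothetical target directory /opt/talend/dq-indexes. The data.management.* folders in this file are defined relative to ${dataquality.indexes.file.location}, so they follow the new location without further changes.

```properties
# Hypothetical new location
# (default: data/data-quality/org.talend.dataquality.semantic)
dataquality.indexes.file.location=/opt/talend/dq-indexes

# Unchanged line; now resolves to /opt/talend/dq-indexes/index/dictionary
data.management.lucene.documents.folder=${dataquality.indexes.file.location}/index/dictionary
```

Custom semantic types still have to be copied from the old directory by hand.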

dataquality.semantic.list.enable=true

dataquality.server.url=http://<local machine ip>:8187/

Parameter to activate semantic type editing in the Talend Data Preparation interface, and URL of the server hosting Talend Dictionary Service

dataquality.semantic.update.enable=true

dataquality.event.store=mongodb

spring.cloud.stream.kafka.binder.brokers=<local machine ip>

spring.cloud.stream.kafka.binder.zkNodes=<local machine ip>

spring.cloud.stream.kafka.binder.defaultBrokerPort=9092

spring.cloud.stream.kafka.binder.defaultZkPort=2181

spring.cloud.stream.bindings.input.destination=${MESSAGING_DOCUMENT_QUEUE:dictionary}

spring.cloud.stream.bindings.input.content-type=application/x-java-object;type=org.talend.dataquality.semantic.model.DQDocumentAction

spring.cloud.stream.bindings.input.group=${MESSAGING_CATEGORY_GROUP:dictionaryGroup}

spring.cloud.stream.bindings.category.destination=${MESSAGING_CATEGORY_QUEUE:category}

spring.cloud.stream.bindings.category.content-type=application/x-java-object;type=org.talend.dataquality.semantic.model.DQCategoryAction

spring.cloud.stream.bindings.category.group=${MESSAGING_REGEX_GROUP:dictionaryGroup}

spring.cloud.stream.bindings.regEx.destination=${MESSAGING_REGEX_QUEUE:regex}

spring.cloud.stream.bindings.regEx.content-type=application/x-java-object;type=org.talend.dataquality.semantic.model.DQCategoryAction

spring.cloud.stream.bindings.regEx.group=${MESSAGING_REGEX_GROUP:dictionaryGroup}

data.management.lucene.documents.folder=${dataquality.indexes.file.location}/index/dictionary

data.management.lucene.categories.folder=${dataquality.indexes.file.location}/category

data.management.receiving.folder=${dataquality.indexes.file.location}/index/received/

data.management.regex.folder=${dataquality.indexes.file.location}/regex

Data quality update parameters

streams.enable=false

streams.flow.runner.url=http://<Streams Runner ip>:<Streams Runner port>/streams.run/v1

streams.kerberos.principal=<principal>

streams.kerberos.keytab_path=<keytab path>

streams.hdfs.server.url=hdfs://<host>:<port>/<filepath>

Streams Runner configuration parameters

Enable these parameters to configure Talend Data Preparation with Big Data
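A sketch of a filled-in Streams Runner block. Every host name, port, principal and path below is hypothetical and only shows the expected shape of each value; setting streams.enable to true is likewise an assumption about how the feature is switched on.

```properties
# Assumption: true activates the Streams Runner integration
streams.enable=true

# Hypothetical Streams Runner host and port
streams.flow.runner.url=http://streams-runner.example.com:8080/streams.run/v1

# Hypothetical Kerberos principal and keytab
streams.kerberos.principal=dataprep@EXAMPLE.COM
streams.kerberos.keytab_path=/etc/security/keytabs/dataprep.keytab

# Hypothetical HDFS location
streams.hdfs.server.url=hdfs://namenode.example.com:8020/user/dataprep
```

The tcomp-SimpleFileIoDatastore and tcomp-SimpleFileIoDataset defaults listed earlier reference the last three streams.* values, so the HDFS import form picks them up from here.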

security.basic.enabled=false

security.oidc.client.expectedIssuer=accounts.talend.com

iam.license.url=http://${iam.ip}:9080/oidc/services

security.oidc.client.keyUri=http://${iam.ip}:9080/oidc/jwk/keys

security.oauth2.client.clientId=<randomly generated Id>

security.oauth2.client.clientSecret=<encrypted password>

security.oidc.client.claimIssueAtTolerance=120

security.oauth2.resource.serviceId=${PREFIX:}resource

security.oauth2.resource.tokenInfoUri=http://${iam.ip}:9080/oidc/oauth2/introspect

security.oauth2.resource.uri=/api/**,/folders/**,/datasets/**,/preparations/**,/transform/**,/version/**,/acl/**,/apply/**,/export,/export/**,/aggregate,/sampling/**,/receivers/**,/error,/docs,/datastores/**,/preparation/**

security.oauth2.resource.filter-order=3

security.oauth2.resource.tokenInfoUriCache.enabled=true

security.scim.cache.enabled=true

security.scim.enabled=true

security.oauth2.client.access-token-uri=http://${iam.ip}:9080/oidc/oauth2/token

security.oauth2.client.scope=openid refreshToken

security.oauth2.client.user-authorization-uri=http://${iam.ip}:9080/oidc/idp/authorize?prompt=none

security.oauth2.sso.login-use-forward=false

server.session.cookie.name=TDPSESSION

security.sessions=stateless

security.user.password=none

Single Sign-On security configuration parameters

security.oidc.client.endSessionEndpoint=http://${iam.ip}:9080/oidc/idp/logout

security.oidc.client.logoutSuccessUrl=http://${public.ip}:${server.port}

security.oauth2.logout.uri=/signOut

security.oauth2.sso.login-path=/signIn

iam.scim.url=http://${iam.ip}:9080/scim/

Single Sign-On properties for Talend Data Preparation API and Gateway

gateway-api.service.url=http://${public.ip}:${server.port}

gateway-api.service.path=/gateway

zuul.servletPath=/gateway/upload

zuul.routes.dq.path=/gateway/dq/semanticservice/**

zuul.routes.dq.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.dq.url=${dataquality.server.url}/

proxy.auth.routes.dq=oauth2

zuul.routes.api.path=/gateway/api/**

zuul.routes.api.sensitiveHeaders=${zuul.sensitiveHeaders}

zuul.routes.api.url=http://${public.ip}:${server.port}/api

proxy.auth.routes.api=oauth2

zuul.sensitiveHeaders=Cookie,Set-Cookie,Expires,X-Content-Type-Options,X-Xss-Protection,Cookie,X-Frame-Options,Cache-control,Pragma

zuul.host.socket-timeout-millis=300000

zuul.host.connect-timeout-millis=5000

Single Sign-On configuration parameters. Do not change these values unless instructed to do so by Talend.

logging.file=data/logs/app.log

Path of the log file

logging.pattern.level=%5p [user %X{user}]

Level output pattern for the log file

logging.pattern.file=%d{yyyy-MM-dd HH:mm:ss.SSS} %5p --- [%t] %-40.40logger{39} : %m%n%wEx

Uncomment this parameter to enable log pattern configuration

logging.level=WARN

logging.level.org.talend.dataprep=INFO

logging.level.org.talend.dataprep.api=INFO

logging.level.org.talend.dataprep.dataset=INFO

logging.level.org.talend.dataprep.preparation=INFO

logging.level.org.talend.dataprep.transformation=INFO

logging.level.org.talend.dataprep.fullrun=INFO

logging.level.org.talend.dataprep.api.dataquality=INFO

logging.level.org.talend.dataprep.configuration=INFO

Talend Data Preparation loggers parameters

logging.pattern.console=%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(%5p) %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n%wEx

Uncomment this parameter to enable console logging pattern configuration

spring.output.ansi.enabled=always

Uncomment this parameter to configure ANSI coloring in the console output

logging.config=logback.xml

Uncomment this parameter to configure Talend Data Preparation logging with a custom logback file.

Enter the path to your logback file.
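For instance, assuming the custom file is stored at the hypothetical path shown below, the uncommented parameter could look like this:

```properties
# Hypothetical path to a custom logback file
logging.config=/opt/talend/dataprep/config/logback.xml
```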