Talend Data Stewardship Cloud
Talend Data Stewardship Cloud engages your data workers to partner up and collaborate on turning data into highly shared, curated, and ready-to-use assets. It empowers anyone to collaborate on data curation, arbitration, or validation within their field of responsibility, through an intuitive, workflow-based, and outcome-driven experience.
Feature | Description |
---|---|
Curate and certify data | Improve productivity and get guidance in your data certification and task curation. |
Collaborate for trusted data | Orchestrate your data stewardship activities in campaigns and delegate tasks to the people that know the data best. |
Integrate data stewardship | Embed governance and stewardship in your data management efforts. |
Audit and track data error resolution actions | Comply with your data governance policies and measure the outcomes of your data stewardship efforts. |
Service Level Agreement (SLA) on task resolution | Define due dates at campaign level and also at task level, i.e. from the studio. |
Promote campaigns and data models across environments | Work more easily with several Talend Data Stewardship environments that run the same Talend product version. Export campaigns or data models from a source environment, and then import them into the target environment. |
Copy campaigns and data models in Talend Data Stewardship | Copy a campaign or a data model in your current instance and modify any of its metadata or parameter values to create a close copy without having to recreate the campaign or data model from scratch. |
Functions on data in columns | Several functions have been introduced to improve the enrichment and cleansing possibilities. |
Known issues: https://jira.talendforge.org/issues/?filter=27113
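The issues below relate to two caches with different time-to-live (TTL) values:
- Short Cache (SC), which contains the user's entitlements: TTL of 2 minutes,
- Long Cache (LC), which contains the user's first name and last name: TTL of 15 minutes.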
Issue | Workaround |
---|---|
Log in with a freshly created user account: When the account manager creates a new Data Stewardship user using Talend Cloud Management Console, this newly created user account can get an "Access Denied" error message if used straight away. | Wait for SC TTL expiration (which can take up to two minutes) before connecting and accessing Data Stewardship. |
Use a freshly created user account in campaign definition: When the account manager creates a new Data Stewardship user using Talend Cloud Management Console, this new user account cannot be used straight away in the campaign definition even if it is listed in the "Campaign Owners" / "Stewards" drop-down lists. If campaign owners try to use the newly created account in the campaign definition before LC TTL expiration, they won't be able to save the campaign and will get one of the following error messages: | The campaign owner should wait for LC TTL expiration (which can take up to 15 minutes) before using the newly created user account in the campaign definition. |
Log in to Data Stewardship between SC TTL and LC TTL expiration: If users log in to Data Stewardship for the first time between SC TTL expiration and LC TTL expiration, their first and last names won't be properly displayed. "Deleted user <UUID>" will be displayed instead of the user's "<first name> <last name>". The mainly impacted pages in Data Stewardship are: This display issue is temporary. It lasts at most 15 minutes. The user's "<first name> <last name>" will be properly displayed once both caches, SC and LC, are renewed. | The campaign owner should wait for LC TTL expiration (which can take up to 15 minutes) before using the newly created user account in the campaign definition. |
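The wait times in these workarounds follow directly from the two cache TTLs. As a minimal sketch, the worst-case time at which a newly created account becomes usable can be computed as shown below. The function and constant names are hypothetical and not part of any Talend API; only the 2-minute and 15-minute TTL values come from these notes.

```python
from datetime import datetime, timedelta

# TTL values taken from the release notes; the names are illustrative only.
SHORT_CACHE_TTL = timedelta(minutes=2)   # SC: user entitlements
LONG_CACHE_TTL = timedelta(minutes=15)   # LC: first name / last name


def safe_times(created_at: datetime) -> dict:
    """Return the worst-case times after which a new account is safe to use."""
    return {
        # After SC expiry, logging in should no longer return "Access Denied".
        "login": created_at + SHORT_CACHE_TTL,
        # After LC expiry, the account can be used in a campaign definition.
        "campaign_definition": created_at + LONG_CACHE_TTL,
    }


created = datetime(2018, 1, 15, 9, 0, 0)  # example creation time (assumed)
times = safe_times(created)
print(times["login"])                # 2018-01-15 09:02:00
print(times["campaign_definition"])  # 2018-01-15 09:15:00
```

These are upper bounds: a cache may be renewed sooner, in which case the account is usable earlier.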
Get started with Talend Data Stewardship Cloud on this page.
Talend Data Preparation
Feature | Description |
---|---|
Dynamic selection of a preparation in a Job | When using the tDataprepRun component to operationalize your preparations in Talend Studio, you can select the Dynamic preparation selection check box to define a preparation via its path in the application, rather than its technical id. This allows you to dynamically select a preparation at runtime, depending on the input data for example. |
New parameters for CSV datasets | When working with data from local CSV files, you can now configure the escape and text enclosure characters, as well as the encoding, at import and export time. |
Snowflake connectivity | Talend Data Preparation now offers direct connectivity to data stored in Snowflake databases in order to create datasets. |
Filter data outside of the sample | From each individual column, you can create a filter on empty, invalid, or valid data, even if there is no matching value in the sample, in order to fetch more rows. |
Column name used in data discovery | In order to identify the semantic type of your columns with more accuracy, the name of the column is now taken into account during the data discovery process. |
New functions | Several new functions have been added to improve the enrichment and cleansing possibilities: Some of the existing functions have been improved with new parameters or a new behavior: Moreover, some functions can be applied on the whole table: Finally, you can now decide if you want to output the effects of certain functions in a new column or in place by selecting the Create new column check box. |
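To illustrate what the escape character, text enclosure character, and encoding settings for CSV datasets control when a file is parsed, here is a minimal sketch using Python's standard `csv` module. It does not show how Talend Data Preparation is implemented or configured; the sample data, the `;` delimiter, and the Latin-1 encoding are assumptions made for the example.

```python
import csv
import io

# Sample CSV content stored as Latin-1 bytes (assumed for illustration).
# The second field of the data row contains escaped enclosure characters.
raw = 'name;comment\n"Café Noir";"said \\"hello\\""\n'.encode("latin-1")

# Encoding: decode the bytes with the encoding the file was written in.
text = io.StringIO(raw.decode("latin-1"), newline="")

reader = csv.reader(
    text,
    delimiter=";",
    quotechar='"',      # text enclosure character
    escapechar="\\",    # escape character
    doublequote=False,  # enclosure characters inside fields are escaped, not doubled
)
rows = list(reader)
print(rows)
# [['name', 'comment'], ['Café Noir', 'said "hello"']]
```

Decoding with the wrong encoding or choosing the wrong escape or enclosure character would corrupt accented characters or split fields incorrectly, which is why these settings are useful at import and export time.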
Known issues: https://jira.talendforge.org/issues/?filter=26475
Issue | Workaround |
---|---|
Due to a compatibility issue between the Talend Studio and Talend Data Preparation libraries, the tDataprepRun component does not work in a Spark Batch or Streaming Job. See issue https://jira.talendforge.org/browse/TDP-5244. | In order to work with Talend Data Preparation Cloud, you can either: As a side effect, patching your Talend Studio to make it compatible with Cloud will make it incompatible with the on-premises version of Talend Data Preparation. Consequently, if you want to create Spark Jobs that are compatible with both the Cloud and the on-premises versions of Talend Data Preparation, you will actually need to work with two versions of Talend Studio: one with the patch to work on the Cloud, and one without the patch to work on-premises. |
Get started with Talend Data Preparation Cloud on this page.
Talend Integration Cloud
Feature | Description |
---|---|
New public API version | With the Talend Integration Cloud Public API v1.1, it is possible to use <parmname> and <parameter_parmname> formats when setting context parameters in Studio. The values of the context parameters do not change to <parameter_parmname> during execution. |
Talend Exchange | As Integration Actions are no longer supported on Talend Integration Cloud, Talend Exchange has been removed from the following modules: |
Talend Cloud components | The tActionReject component has been fully reinstated. The tActionLog and tActionFailure components have been renamed to tJobLog and tJobFailure. |
Flow configuration | Advanced context parameters are visible and configurable in the Flow Configuration panel in the Flow Builder. |
Known issues: https://jira.talendforge.org/issues/?filter=26511
Get started with Talend Integration Cloud on this page.
Talend Cloud Management Console
Feature | Description |
---|---|
Talend Cloud Management Console roles | The rights of the previous Administrator role have been divided between two roles: Project Administrator and Security Administrator. The Project Administrator can manage: The Security Administrator can manage: |
Talend Data Stewardship roles | With the introduction of Talend Data Stewardship on Cloud, the following roles can be assigned to users: |
Known issues: https://jira.talendforge.org/issues/?filter=28553
Get started with Talend Cloud Management Console on this page.
Talend Studio
Talend Cloud Winter '18 ships with the 6.5.1 release of Talend Real-Time Big Data Platform. For the list of new features, see Talend Real-Time Big Data Platform Release Notes.