What's new in Winter '17

author
Talend Documentation Team
EnrichVersion
6.3
2.1
EnrichProdName
Talend Real-Time Big Data Platform
Talend Big Data
Talend Open Studio for Big Data
Talend MDM Platform
Talend Open Studio for MDM
Talend ESB
Talend Data Fabric
Talend Open Studio for Data Quality
Talend Big Data Platform
Talend Data Services Platform
Talend Data Management Platform
Talend Open Studio for ESB
Talend Open Studio for Data Integration
Talend Data Integration
task
Administration and Monitoring
Deployment
Data Quality and Preparation
Data Governance
Design and Development
Installation and Upgrade
EnrichPlatform
Talend Data Preparation

This technical note highlights the important new features and capabilities of Talend Winter '17, including our new data stewardship application and many new features for data preparation, big data integration, data integration, application integration, master data management, data quality and cloud integration. This release also brings major version upgrades to many supported databases and Apache Hadoop distributions. Supported features vary between Talend Open Studio and subscription products. Please refer to the product pages at http://www.talend.com for more detail.

With Talend Winter '17, the next version of the Talend Data Fabric, we are excited to:

  • Deliver Talend Data Preparation for big data so you can increase big data usage across your company
  • Introduce a new Talend Data Stewardship app to help companies collaboratively improve data integrity and better manage the lifecycle of data
  • Announce important updates to keep you on the cutting edge of big data and cloud technologies
    • New and enhanced Apache Hadoop and NoSQL platforms, including Cloudera 5.8, Hortonworks 2.5, MapR 5.2, Apache Spark 2.0 and Amazon EMR 5.0
    • New and updated components for Amazon SQS (Simple Queue Service), Amazon S3 (multipart upload support), MapR-DB, MapR-Streams, Salesforce (Summer '16), Salesforce Wave (Summer '16), Microsoft SQL Server 2016, Bonita, Dropbox V2, Apache Camel 2.17.3, Apache Karaf 4.0.6, Apache ActiveMQ 5.14.0, Apache CXF 3.1.7 and Spring Boot 1.3.7

Data Preparation

Talend Winter '17 empowers information workers with self-service access to the data lake through extensive big data support.

You can now run preparations faster, in self-service mode, inside your Apache Hadoop cluster thanks to Apache Spark support, and reuse a preparation in Talend Big Data Jobs.

You can take advantage of self-service big data connectors for CSV, Parquet, and Avro on HDFS (with Kerberos), as well as a JDBC connector to access traditional data sources.

You can define custom semantic types so you can use your own business language to manage your data.
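
To make the idea of a custom semantic type more concrete, here is a minimal sketch in Python. It is not the Talend Data Preparation implementation or API. It assumes a semantic type can be modelled as a named rule (here a regular expression) against which the values of a column are checked. The type name `EmployeeId`, its pattern and the sample column are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SemanticType:
    """A hypothetical business-specific type: a name plus a validation pattern."""
    name: str
    pattern: str

    def matches(self, value: str) -> bool:
        # fullmatch: the whole (trimmed) value must conform, not just a prefix
        return re.fullmatch(self.pattern, value.strip()) is not None


def match_ratio(sem_type: SemanticType, column: List[str]) -> float:
    """Fraction of non-empty values in the column that conform to the type."""
    values = [v for v in column if v.strip()]
    if not values:
        return 0.0
    return sum(sem_type.matches(v) for v in values) / len(values)


# Hypothetical company-specific identifier: "EMP-" followed by six digits.
EMPLOYEE_ID = SemanticType(name="EmployeeId", pattern=r"EMP-\d{6}")

column = ["EMP-004211", "EMP-118530", "emp-12", "", "EMP-990001"]
ratio = match_ratio(EMPLOYEE_ID, column)
invalid = [v for v in column if v.strip() and not EMPLOYEE_ID.matches(v)]
```

In this sketch, `ratio` tells you how well a column fits the business type and `invalid` lists the values that would need attention. A dictionary of allowed values could replace the regular expression without changing the rest of the code.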

The redesigned user interface also brings numerous productivity gains.

Data Stewardship

More than just a tool for data stewards, the new Talend Data Stewardship app helps IT and data stewards collaborate on data quality issues and manage data assets more effectively.

You can use it to define the data model, semantics and rules used to cleanse and validate data.

Talend Data Stewardship eases collaboration by defining user roles, workflows and priorities, and delegating tasks to the people who know the data best.

You can improve productivity in your data curation tasks: match and merge your data, resolve data errors, certify and arbitrate.
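
As an illustration of what a match and merge task involves, the following Python sketch groups records that refer to the same entity and merges each group into a single record. It is a simplified assumption, not the way Talend Data Stewardship works internally: the matching key (a normalized email), the survivorship rule (the newest non-empty value wins) and the sample records are hypothetical, and in a real campaign a data steward would arbitrate the cases that such rules cannot settle.

```python
from collections import defaultdict


def match_key(record: dict) -> str:
    """Normalize the email so trivially different spellings fall in one group."""
    return record["email"].strip().lower()


def merge(records: list) -> dict:
    """Survivorship rule (an assumption): newest non-empty value wins per field."""
    merged = {}
    # Apply records oldest first so that newer values overwrite older ones.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record.items():
            if value not in ("", None):
                merged[field] = value
    return merged


def match_and_merge(records: list) -> list:
    """Group records by matching key, then merge each group into one record."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [merge(group) for group in groups.values()]


records = [
    {"email": "Ana@Example.com ", "phone": "", "city": "Lyon", "updated": "2016-03-01"},
    {"email": "ana@example.com", "phone": "555-0100", "city": "", "updated": "2016-09-15"},
    {"email": "bo@example.com", "phone": "555-0199", "city": "Oslo", "updated": "2016-05-20"},
]
golden = match_and_merge(records)
```

Here the two records for the same person collapse into one that keeps the city from the older record and the phone number from the newer one, while the unrelated record passes through unchanged.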

Governance and stewardship tasks can be embedded into data integration flows, MDM initiatives and matching processes.

You can monitor and audit your stewardship campaigns and data error resolution decisions.

Big Data

Talend Winter '17 increases the scalability of your big data projects while supporting the latest technologies.

Support is provided for Apache Spark 2.0 on Amazon EMR, including Spark SQL, which runs on both data in motion and data at rest. A Technical Preview of Spark Structured Streaming is included to simplify batch and streaming integration.

Joblets are now available in Apache Spark Jobs, so you can factor out recurring processing or complex transformation steps and make a complex Spark Job easier to read.

New components for MapR-DB and MapR-Streams expand your integration options.

Integration with Apache Atlas improves data governance of the Apache Hadoop cluster and lets you trace the lineage of any given data to discover how this data was generated by a MapReduce Job or a Spark Batch Job.
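
Tracing lineage amounts to walking a graph from a dataset back through the Jobs that produced it. The Python sketch below shows that idea only. It does not call Apache Atlas or any Talend API, and the dataset paths, Job names and the `PRODUCED_BY` structure are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each output dataset maps to the Job that
# produced it and the datasets that Job read.
PRODUCED_BY = {
    "/warehouse/customer_360": ("spark_batch_join_job", ["/staging/customers", "/staging/orders"]),
    "/staging/customers": ("mapreduce_cleanse_job", ["/raw/crm_export"]),
    "/staging/orders": ("spark_batch_ingest_job", ["/raw/order_logs"]),
}


def trace_lineage(dataset: str) -> list:
    """Walk upstream breadth-first, returning (output, job, input) triples."""
    steps = []
    seen = {dataset}
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        if current not in PRODUCED_BY:
            continue  # a source dataset: nothing upstream is recorded
        job, inputs = PRODUCED_BY[current]
        for source in inputs:
            steps.append((current, job, source))
            if source not in seen:
                seen.add(source)
                queue.append(source)
    return steps


lineage = trace_lineage("/warehouse/customer_360")
```

Each triple in `lineage` answers the question "which Job generated this data, and from what input", starting at the requested dataset and moving back towards the raw sources.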