Designing Jobs

Talend Software Development Life Cycle Best Practices Guide

author: Talend Documentation Team
EnrichVersion: 6.5
EnrichProdName: Talend Big Data, Talend MDM Platform, Talend Data Integration, Talend Data Fabric, Talend Data Services Platform, Talend ESB, Talend Data Management Platform
task: Design and Development, Deployment, Administration and Monitoring
EnrichPlatform: Talend Repository Manager, Talend Studio, Talend Administration Center, Talend JobServer, Talend CommandLine, Talend Artifact Repository

At this stage, the conceptualization phase is complete and each team has been assigned its tasks. The development team designs Jobs, the unit of development in Talend, in Talend Studio. Jobs let you implement operational dataflow management processes.
Best practices: To support continuous integration during development and to help developers design and build consistent, efficient, and optimized Jobs, we recommend that you follow these best practices:

Concept

Best practice example

Naming standards

In the Studio, define a naming convention for Jobs and folders and follow it.

This document uses the following naming convention, which you can adapt to your requirements: the job prefix for Job names, the test prefix for Test Case names, the pub prefix for publishing task names, and the task prefix for execution task names.

Use folders to group Jobs of a similar type. For example, name your folder xxx, then create a Job named job_xxx_description and its Test Case named test_xxx_description.

At a more granular level, components should also have a meaningful name.
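
A naming convention is easier to follow when it can be checked automatically. The following is a minimal sketch in plain Java, not a Talend API: the class NamingConvention and its methods are hypothetical, and the allowed character set (lowercase letters, digits, and underscores) is an assumption layered on top of the job_xxx_description and test_xxx_description patterns described above.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it is not part of Talend Studio.
final class NamingConvention {
    // Assumed shape: <prefix>_<folder>_<description>, using lowercase
    // letters, digits, and underscores. Adapt the patterns to your own rules.
    private static final Pattern JOB_NAME =
            Pattern.compile("job_[a-z0-9]+_[a-z0-9_]+");
    private static final Pattern TEST_CASE_NAME =
            Pattern.compile("test_[a-z0-9]+_[a-z0-9_]+");

    private NamingConvention() {
    }

    static boolean isValidJobName(String name) {
        return JOB_NAME.matcher(name).matches();
    }

    static boolean isValidTestCaseName(String name) {
        return TEST_CASE_NAME.matcher(name).matches();
    }

    // Derives the Test Case name that pairs with a given Job name.
    static String testCaseNameFor(String jobName) {
        if (!isValidJobName(jobName)) {
            throw new IllegalArgumentException("Not a valid Job name: " + jobName);
        }
        return "test_" + jobName.substring("job_".length());
    }

    public static void main(String[] args) {
        String jobName = "job_sales_load_orders"; // example name, assumed
        System.out.println(isValidJobName(jobName));
        System.out.println(testCaseNameFor(jobName));
    }
}
```

A check of this kind could run as part of a review or build step, so that names that drift from the convention are caught before they reach the shared repository.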

Version control

Use SVN/Git branches and tags, together with the Studio, to handle Job versions.

For more information on how to change the version of all your Jobs centrally, in a single operation, so that you can publish them with the version of your choice, see How to change the deployment version of each Job or Route at once.

Metadata

Use schema metadata in your Jobs to share database connections between several Jobs and to help design source and target components.

Contexts

Use contexts to reuse variables such as database connection details, host names, and ports: context parameters apply locally to a Job, and context groups apply globally to a project. If a value may need to change or is used in multiple places, do not hard-code it; use a context instead.

These contexts are also useful for switching between environments (a Development context, then a QA context, then a Production context).
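
The idea behind contexts can be illustrated outside the Studio. The sketch below is plain Java, not the code that Talend generates for a Job: the class EnvironmentContexts, the variable names db_host, db_port, and db_name, the example.com host names, and the MySQL-style URL are all assumptions chosen for illustration. It shows the same logic reading its connection settings from a named context instead of hard-coded values.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a stand-in for context groups, not Talend-generated code.
final class EnvironmentContexts {
    private static final Map<String, Map<String, String>> CONTEXTS = new HashMap<>();

    static {
        // One set of values per environment; all values are hypothetical.
        CONTEXTS.put("Development", context("dev-db.example.com", "3306", "sales"));
        CONTEXTS.put("QA", context("qa-db.example.com", "3306", "sales"));
        CONTEXTS.put("Production", context("prod-db.example.com", "3306", "sales"));
    }

    private EnvironmentContexts() {
    }

    private static Map<String, String> context(String host, String port, String name) {
        Map<String, String> values = new HashMap<>();
        values.put("db_host", host);
        values.put("db_port", port);
        values.put("db_name", name);
        return values;
    }

    // Returns the variables of the requested environment.
    static Map<String, String> load(String environment) {
        Map<String, String> values = CONTEXTS.get(environment);
        if (values == null) {
            throw new IllegalArgumentException("Unknown context: " + environment);
        }
        return values;
    }

    // The processing logic only refers to variable names, never to literal hosts.
    static String jdbcUrl(Map<String, String> ctx) {
        return "jdbc:mysql://" + ctx.get("db_host") + ":" + ctx.get("db_port")
                + "/" + ctx.get("db_name");
    }

    public static void main(String[] args) {
        String environment = args.length > 0 ? args[0] : "Development";
        System.out.println(jdbcUrl(load(environment)));
    }
}
```

Because only the selected context changes between runs, the logic that builds the connection stays identical from Development to Production.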

Standard Job layout

Use a standard Job layout to keep Jobs readable. This is particularly useful for collaborative work.

Some examples include: laying out data flows from left to right, using a top-to-bottom layout to show the process flow between subJobs, and placing target components on the right.

Complexity

Jobs should follow a clear logic and be split into steps, called subJobs, when necessary. It is also recommended to use parent Jobs to run one or several child Jobs in order to create a process flow. Although there is no hard limit, avoid using more than 20 components in a Job.

Once the Job is designed in a remote project, from the Studio or from the CommandLine (via the exportJob command), it can be published, deployed, and executed in Talend Administration Center. Exporting the Job as an artifact also helps you run quality assurance tests on exactly the same Jobs as those created in the Development environment. For more information, see Continuous Integration: Deploying to QA and Production environments.