Designing Jobs - 6.3

Talend Software Development Life Cycle Best Practices Guide

At this stage, the conceptualization part is done and each team has been assigned some tasks.

The development team designs Jobs, the unit of development in Talend, in Talend Studio. Jobs allow you to implement working dataflow management processes.

Best practices: To ensure continuous integration during development and to help developers design and build consistent, efficient and optimised Jobs, we recommend that you follow these best practices:

Concept

Best practice example

Naming standards

In the Studio, define a naming convention for Jobs and folders and follow it.

This document uses the following naming convention, which you can adapt to your requirements: "job" prefix for Job names, "test" prefix for Test Case names, "pub" prefix for publishing task names and "task" prefix for execution task names.

Use folders to group Jobs of a similar type. For example, name your folder xxx, then create a Job named job_xxx_description and its Test Case named test_xxx_description.

This Job is then used to automatically create the Test Case named test_xxx.

At a more granular level, components should also have a meaningful name.
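A naming convention is easier to follow when it can be checked automatically. The following is a minimal sketch in Python, not part of any Talend tool: the `check_name` function, the `PREFIXES` table and the lowercase-with-underscores pattern are assumptions made for illustration, based on the job_xxx_description form described above.

```python
import re

# Prefixes from the naming convention described in this document.
# The dictionary keys are hypothetical labels chosen for this sketch.
PREFIXES = {
    "job": "job_",
    "test_case": "test_",
    "publishing_task": "pub_",
    "execution_task": "task_",
}

# Assumption: names are lowercase words or numbers separated by single
# underscores, for example job_sales_load_orders.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)+$")


def check_name(kind: str, name: str, folder: str) -> bool:
    """Return True if name looks like <prefix>_<folder>_<description>."""
    expected_start = PREFIXES[kind] + folder + "_"
    if not name.startswith(expected_start):
        return False
    # The pattern rejects a trailing underscore, so an empty description fails.
    return NAME_PATTERN.match(name) is not None


if __name__ == "__main__":
    for kind, name in [
        ("job", "job_sales_load_orders"),
        ("test_case", "test_sales_load_orders"),
        ("job", "LoadOrders"),
    ]:
        status = "ok" if check_name(kind, name, "sales") else "does not follow the convention"
        print(f"{name}: {status}")
```

A check like this could run against a list of item names before a commit or a build. If you adapt the convention, adjust the prefixes and the pattern to match.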

Version control

Use SVN or Git branches and tags to handle Job versions.

Metadata

Use schema metadata in your Jobs to share database connections between several Jobs and to help design source/target components.

Contexts

Use contexts to reuse variables such as database connection details, host names and ports (context parameters locally for Jobs, context groups globally for projects). If a value may need to change or is used in multiple places, do not hard-code it; use a context instead.

Contexts are also useful for switching between environments (Development context, then QA context, then Production context).
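The principle behind contexts can be illustrated outside the Studio. The following Python sketch is an analogy only, not Talend's context mechanism: the `CONTEXTS` table, the host names, the database name and the `load_context` and `connection_url` helpers are all hypothetical. It shows the same logic consuming a different set of values per environment, with nothing hard-coded in the logic itself.

```python
# Hypothetical context values, one set per environment.
# In Talend, these would be defined as contexts in the Studio.
CONTEXTS = {
    "Development": {"db_host": "dev-db.example.com", "db_port": 5432, "db_name": "sales"},
    "QA": {"db_host": "qa-db.example.com", "db_port": 5432, "db_name": "sales"},
    "Production": {"db_host": "prod-db.example.com", "db_port": 5432, "db_name": "sales"},
}


def load_context(environment: str) -> dict:
    """Return the set of values for the requested environment."""
    try:
        return CONTEXTS[environment]
    except KeyError:
        raise ValueError(f"Unknown environment: {environment}") from None


def connection_url(context: dict) -> str:
    """Build a connection URL from context values instead of literals."""
    return f"jdbc:postgresql://{context['db_host']}:{context['db_port']}/{context['db_name']}"


if __name__ == "__main__":
    # The same code runs against each environment; only the context changes.
    for environment in ("Development", "QA", "Production"):
        print(environment, connection_url(load_context(environment)))
```

Because the connection details live in one place, moving from Development to QA to Production changes the selected context rather than the code that uses it.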

Standard Job layout

Use a standard Job layout to ensure readability, which is particularly useful for collaborative work.

Examples include laying out data flows from left to right, using a top-to-bottom layout to show the process flow between subJobs, and placing target components on the right.

Complexity

Jobs should follow a clear logic and be split into steps, called subJobs, when necessary. It is also recommended to use parent Jobs to run one or several child Jobs in order to create a process flow. Although there is no limit, you should avoid using more than 20 components in a Job.

Once the Job is designed in a remote project from the Studio or the CommandLine (via the exportJob command), it can be published, deployed and executed in Talend Administration Center. Exporting the Job as an artifact also makes it possible to run quality assurance tests on exactly the same Jobs as those created in the Development environment. For more information, see Continuous Integration: Deploying to QA and Production environments.

For more information on how to export a specific Job, see the Talend Studio User Guide.

For more information on how to import a Job on the Job Conductor page in order to deploy it, see the Talend Administration Center User Guide.