Best Practices: Git with Talend - 8.0

Talend provides powerful Git version control functions that allow you to manage your project without using external Git tools.

Many Git workflows exist. This article covers two of the most common ones, the Git feature branch workflow and the Gitflow workflow. It gives an overview of each and walks through examples of how to follow them when using Git with Talend Studio.

For more information about the Git workflows, see Comparing Workflows.

For more detailed information about working with project branches and tags in Talend Studio, see Talend Data Fabric Studio User Guide.

Git Feature Branch Workflow

The Git feature branch workflow is an efficient way to work with the team in Talend Studio. In this workflow, all feature development takes place on dedicated branches separate from the main branch.

Note: In Talend Studio, the default Git branch appears as main (if such a branch exists) instead of master.

The feature branch workflow assumes a central repository in which main represents the official project history. Each time developers start working on a new feature, they create a new branch instead of committing directly on their local main branch. As a result, multiple developers can work on one feature, or on several features at the same time, without touching the main branch.

The following scenario shows how the feature branch workflow can be used. It is only one of the many purposes this model can serve.

Suppose two developers on a team, Andy and Lucy, are working on the same feature, talend-123.

Andy is first assigned to work on this new feature. He creates a new feature branch talend-123 based on main in Talend Studio and switches to the newly created branch.

Note: Feature branches should have descriptive names, such as the feature ID or the JIRA ticket number.

Andy is now logged into the Git local branch talend-123 in Talend Studio. At this point the branch only exists on Andy’s machine.

Andy does a push to create the feature branch on the remote Git repository using the Git push tool provided within Talend Studio.
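For readers who also use Git outside Talend Studio, the two steps above (create the feature branch, then publish it) correspond roughly to the command-line calls sketched below. This is a minimal sketch and not a description of what Talend Studio runs internally: the helper name publish_feature_branch and the remote name origin are assumptions made for illustration. The function only builds the command lines, so it can be inspected without touching a repository.

```python
import shlex


def publish_feature_branch(feature, base="main", remote="origin"):
    """Return the Git commands that create a feature branch and publish it.

    `remote` defaults to "origin", which is an assumption; use the name of
    your own central repository.
    """
    return [
        # Create the feature branch from the base branch and switch to it.
        ["git", "checkout", "-b", feature, base],
        # Create the branch on the remote repository and track it.
        ["git", "push", "--set-upstream", remote, feature],
    ]


if __name__ == "__main__":
    # Print the commands for Andy's branch instead of executing them.
    for command in publish_feature_branch("talend-123"):
        print(shlex.join(command))
```

To actually run the commands, each list could be passed to subprocess.run(command, check=True) from inside a clone of the project repository.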

Andy creates two Jobs, job1 and job2, and the changes are automatically committed to the local branch when they are saved. He then pushes the changes to the remote Git branch manually using the Git push tool provided within Talend Studio.

Now Lucy is also assigned to work on this feature. She checks out the remote feature branch talend-123 as a local one, then she changes the Job job1 and creates a new Job job3 on the local branch.

While Lucy is working on the Job job1, Andy also makes a change to the Job job1, creates a Joblet joblet1 on the local feature branch, and then pushes the changes to the remote repository.

When Lucy tries to push her changes to the Jobs job1 and job3 to the remote branch talend-123, the push is rejected because Andy has just pushed his changes to the Job job1 to the remote repository.

Lucy must first pull from the remote feature branch talend-123. The pull finds conflicts because both Andy and Lucy changed the Job job1.

Lucy needs to resolve the conflicts. In this case, Lucy accepts all changes from Andy and marks the conflicts as resolved. Pull completes when all conflicts are resolved. Lucy pushes her changes successfully.

For more information about resolving conflicts between branches, see Resolving conflicts between branches.

Now the remote feature branch talend-123 contains changes from both developers.
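The rejected push in this scenario follows from Git's general rule that a normal push must fast-forward the remote branch: the commit currently at the tip of the remote branch has to be an ancestor of the commit being pushed. The sketch below is a simplified model of that rule, not Talend or Git code. The commit names (base, andy, lucy, merge) are hypothetical, and the pull is assumed to produce a merge commit with both developers' commits as parents.

```python
def is_ancestor(parents, ancestor, commit):
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack = [commit]
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def try_push(parents, remote_tip, local_tip):
    """Accept the push only if it fast-forwards the remote branch."""
    if is_ancestor(parents, remote_tip, local_tip):
        return local_tip  # the remote branch now points at the pushed commit
    raise RuntimeError("push rejected: pull and resolve conflicts first")


if __name__ == "__main__":
    # Hypothetical history: Andy and Lucy both commit on top of "base".
    parents = {"base": [], "andy": ["base"], "lucy": ["base"]}
    remote_tip = "andy"  # Andy pushed first

    try:
        try_push(parents, remote_tip, "lucy")
    except RuntimeError as error:
        print(error)

    # After the pull, Lucy has a merge commit that includes Andy's commit.
    parents["merge"] = ["lucy", "andy"]
    remote_tip = try_push(parents, remote_tip, "merge")
    print("remote tip:", remote_tip)
```

In this model Lucy's first push fails because Andy's commit is not part of her history. Once the pull has merged his commit into her branch, the same check passes.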

After feature development is finished, a developer can merge the changes on the remote feature branch talend-123 into the remote main branch in three steps: check out the remote main branch as a local branch, execute a pull and merge from the remote talend-123 branch, and push the local main branch to the remote main branch.
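On the command line, this merge sequence could look roughly like the sketch below. It is an approximation under stated assumptions, not the commands Talend Studio issues: the helper name merge_feature_into_main and the remote name origin are hypothetical, and the function only returns the command lines so they can be reviewed before anything is run.

```python
import shlex


def merge_feature_into_main(feature, main="main", remote="origin"):
    """Return Git commands that merge a remote feature branch into main.

    `remote` defaults to "origin", which is an assumption.
    """
    return [
        # Check out main locally (Git creates a tracking branch if needed).
        ["git", "checkout", main],
        # Fetch the remote feature branch and merge it into local main.
        ["git", "pull", remote, feature],
        # Publish the merged main branch to the remote repository.
        ["git", "push", remote, main],
    ]


if __name__ == "__main__":
    for command in merge_feature_into_main("talend-123"):
        print(shlex.join(command))
```

If the pull reports conflicts, they have to be resolved and committed before the final push can succeed.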

Gitflow Workflow

The Gitflow workflow defines a strict branching model designed around the project release. It provides a robust framework for managing large projects.

The Gitflow workflow does not add any new concepts beyond what is required for the feature branch workflow. Instead, it assigns very specific roles to different branches and defines how and when they should interact.

The Gitflow workflow still uses a central repository as the communication hub for all developers. As in the feature branch workflow, developers work locally and push branches to the central repository. The only difference is the branch structure of the project.

Instead of a single main branch, this workflow uses two branches to record the history of the project. The main branch stores the official release history, and the develop branch serves as an integration branch for features. All commits in the main branch should be tagged with a version number.

The following is an example of the type of scenario in which the Gitflow workflow is used.

Suppose two developers, Andy and Lucy, are assigned to work on two new features, feature-123 and feature-456, respectively.

The administrator creates a develop branch based on main and pushes it to the remote Git repository in Talend Studio.

Andy creates a new branch andy-feature-123 based on the develop branch in Talend Studio and switches to the newly created branch.

Like Andy, Lucy creates a new branch lucy-feature-456 based on the develop branch in Talend Studio and switches to the newly created branch.
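The branch setup described so far can be summarized as a short sketch of roughly equivalent Git commands. The helper names, the remote name origin, and the assumption that each developer already has a local develop branch are illustrative only. The branch naming pattern (developer name, hyphen, feature ID) is taken from the branch names in this example.

```python
def create_develop_commands(remote="origin"):
    """Commands the administrator could use to create and publish develop."""
    return [
        ["git", "checkout", "-b", "develop", "main"],
        ["git", "push", "--set-upstream", remote, "develop"],
    ]


def feature_branch_name(developer, feature):
    """Build a branch name such as andy-feature-123."""
    return f"{developer.lower()}-{feature}"


def start_feature_commands(developer, feature):
    """Commands a developer could use to branch off develop.

    Assumes a local develop branch already exists on the developer's machine.
    """
    branch = feature_branch_name(developer, feature)
    return [["git", "checkout", "-b", branch, "develop"]]


if __name__ == "__main__":
    assignments = {"Andy": "feature-123", "Lucy": "feature-456"}
    for command in create_develop_commands():
        print(" ".join(command))
    for developer, feature in assignments.items():
        for command in start_feature_commands(developer, feature):
            print(" ".join(command))
```

Basing every feature branch on develop rather than main keeps main reserved for release history.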

Andy and Lucy now work on their features and create Jobs, each on their own local feature branch, and then push their changes to the remote repository.

While Andy is still working on his feature, Lucy finishes her feature and begins to prepare a release.

Lucy checks out the remote develop branch as a local branch and switches to the local develop branch.