This post is from the CollabNet VersionOne blog and has not been updated since the original publish date.
Git Workflows, Branching & Merging Q&A
CollabNet kicked off the new year with a three-part series called Go Agile with Git, starting on January 15, 2013. The series is designed as a crash course on managing Git workflows and continuous branching and merging in Agile software development, and then explores the power of code review with Gerrit and Jenkins. Part one, on Git workflows, branching and merging, drew over 200 webinar participants. Thank you to everyone who attended; if we missed you, please register for the on-demand replay.
Because of the large audience and limited time, not all of the audience’s questions could be answered during the live presentation, until now…
Our guest presenter, Luca Milanesio, is pleased to elaborate on all questions and comments from the webinar. Luca is Director and cofounder of GerritForge LLP, key technology partner of CollabNet, and has over 20 years of experience in development management, software configuration management and software development lifecycle management in large enterprises worldwide.
Here is the complete log of questions and answers by Luca:
Q: Do you have any tool to migrate source code of one SCM tool to GIT ?
A: Git distributions already include a tool (git svn) that can fetch code from an external SVN repository and push it to a Git repository, including its history and branches. For other SCMs (e.g. CVS, Perforce, TFS) there are open source and commercial tools that migrate the full history of your repository into Git.
Q: Gerrit Dashboards already works?
A: Dashboards were introduced in Gerrit at the November 2012 hackathon and are one of the major features of Gerrit 2.6.
Q: how can use GIT to map it ClearCase UCM
A: Git on its own is not really comparable to ClearCase UCM. Git can be mapped to ClearCase itself, whilst Gerrit Code Review, plus the additional ALM features provided by CollabNet TeamForge or the enterprise issue-tracker integrations of GerritForge LLP, can be mapped to UCM concepts: both define the concept of “project” and the “tracking” between code changes and work items in your development lifecycle. In this mapping, Gerrit and its issue-tracker integrations play the UCM role of associating one or more Git commits with a delivery or a set of issues tracked in ClearQuest.
Q: will we cover Gerrit and AD integration? or Git AD integration ?
A: Git does not provide AD integration out-of-the-box, unless it is integrated and customised through an HTTP front-end (e.g. Apache) that handles user authentication. Gerrit provides AD and industry-standard LDAP support, which will be covered in the second and third webinars in the series.
Q: are there any danger when auto-merging?
A: It depends on whether you have Gerrit triggering your Jenkins CI to “protect” you from wrong auto-merges or you just rely on nightly builds and standard QA validation. When Gerrit and Jenkins are integrated, even auto-merging is not a risk, as the consistency of the code merge is validated. Typically a code-review workflow requires the change to be already “rebased” on the branch that is the candidate to receive it; otherwise you would run the risk of validating the change in the wrong context.
Q: If someone commits a huge binary file by accident and pushes it to the origin/master, can that be undone or is it there to stay forever? Or is this also covered in history protection link?
A: Git always allows you to “remove” commits permanently, using “git reset” or “git rebase -i” (interactive rebase) to remove or amend the commit containing the huge binary file; if the commit has already been pushed, the rewritten branch then has to be force-pushed to the origin. However, Git keeps the binary blob in its object store until garbage collection triggers the physical removal of all unreferenced objects, including the huge binary file. With CollabNet TeamForge history protection you have full control of when and what to remove completely from your Git history (in this case the huge binary file).
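As a minimal sketch of the local side of this clean-up, assuming a throwaway repository in a temporary directory (the file name huge.bin and the demo identity are hypothetical), the commands below drop the offending commit and then ask Git to prune the blob that is no longer referenced. On a shared repository the rewritten branch would still need a forced push, and the server would have to run its own garbage collection.

```shell
#!/bin/sh
set -e

# Throwaway repository with a demo identity (both are assumptions for this sketch)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > README
git add README
git commit -q -m "Initial commit"

# The accidental commit: a 1 MB binary file
dd if=/dev/urandom of=huge.bin bs=1024 count=1024 2>/dev/null
git add huge.bin
git commit -q -m "Add huge binary by accident"
blob=$(git rev-parse HEAD:huge.bin)

# Drop the commit from the branch history
git reset -q --hard HEAD~1

# The blob is still stored, because the reflog and ORIG_HEAD remember the old commit
if git cat-file -e "$blob" 2>/dev/null; then before=present; else before=missing; fi

# Forget the old commit, then prune every unreferenced object right away
git update-ref -d ORIG_HEAD
git reflog expire --expire=now --all
git gc --quiet --prune=now

if git cat-file -e "$blob" 2>/dev/null; then after=present; else after=missing; fi
echo "blob before gc: $before, after gc: $after"
```

Until the garbage collection step runs, the dropped commit can still be recovered through the reflog, which is why a plain “git reset” alone does not free the disk space.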
Q: Can roles in Gerrit be defined in ldap? If so, what are the requirements? Can roles in Gerrit be defined in an external source?
A: Gerrit supports RBAC (role-based access control) through its group-level permissions. Groups can be defined in LDAP, and Gerrit 2.5 can use them and retrieve them dynamically through its pluggable “group backend” infrastructure. The LDAP entry points for groups need to be configured in gerrit.config in order to be fetched by Gerrit. Similarly, other external sources can be plugged into Gerrit as a “group backend” by implementing the connector interface to the external roles as a Gerrit plugin.
Q: You mentioned Jenkins fits the best here. How do you compare it with Team City in this regard?
A: TeamCity is a very advanced CI system and has introduced the concepts of “test builds” and “private builds” to prevent faulty code from breaking the team build, which is the same goal that Code Review pursues: improving Team agility without losing control and code quality. Unfortunately TeamCity is not open source and thus has a more limited community of plugin developers providing integrations for Git and Gerrit. Jenkins CI has the world’s largest community of plugin developers for a Continuous Integration engine and already provides support for the Gerrit Code Review workflow.
Q: How many people voted on Gerrit poll?
Q: What is the minimum number of branches needed if two geographically separated teams are sharing the same codebase and committing to the same codebase? e.g. production, master, dev1, dev2, etc. How many should we maintain to be minimal?
A: The minimum number of branches is actually one: master. Two repositories in two locations each already hold their own “copy” of the “master” branch, and those copies are effectively different branches that get merged when the repositories are synchronised across locations. In practice it is recommended to have at least one active development branch (“master”) and one branch per stable release, regardless of the number of geographically separated teams. Additionally you may want to develop features in parallel on “topic” branches and merge them into “master” once the Teams have validated them through Code Review.
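The recommended layout can be sketched with plain Git commands. This is only an illustration in a temporary repository: the branch names release-1.0 and topic/login, the file names and the demo identity are assumptions, and the code review step that would normally gate the merge is not shown.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

# "master" is the active development branch
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# One branch per stable release, cut from master
git branch release-1.0

# A feature is developed in parallel on a topic branch
git checkout -q -b topic/login
echo "login" > login.txt
git add login.txt
git commit -q -m "Add login feature"

# Once validated, the topic branch is merged into master and deleted
git checkout -q master
git merge -q --no-ff -m "Merge topic/login" topic/login
git branch -q -d topic/login

# Only the long-lived branches remain
branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
echo "$branches"
```

The release branch keeps pointing at the commit it was cut from, so the new feature reaches it only if someone deliberately merges or cherry-picks it there.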
Q: We are about to start the migration from SVN to GIT and are worried about the best way to train the developers – can you give any tips of the best approach
A: Git requires a lot of “day-by-day” support from the beginning, especially for Teams coming from an SVN background. The concepts are different, and some SVN command names have a different meaning and behaviour in Git. It is recommended to tune Gerrit permissions carefully to prevent “dangerous” pushes from “beginner” Team members (e.g. by disabling forge identity and forced push) and to start with dedicated Git training for a group of “champions”, who are then spread across all of the Teams to provide support. Those “champions” then become the “day-by-day” tutors of the other Team members learning how to use Git in everyday work. P.S. Stick the “Git cheat sheet” on the wall of every Team room to reinforce the Git development workflow.
Q: How do users use git on windows systems? Is there a tool like Tortoise?
A: Yes, and its name is TortoiseGit (no surprise here!), but it is only a GUI front-end: Git for Windows needs to be installed as a prerequisite. However, I do encourage Teams to use the Git command line first, in order to understand the concepts and learn how to get the best from the tool when they need it. For the most common operations they can use TortoiseGit or any other Git GUI client (personally I use and really like GitX on MacOS), but they need to understand the underlying commands that are executed as Git actions on the repository.
Q: How to find out which files someone has edited over history of project? It is easy in a specific commit, but seems hard over the life of the project …
A: Git is a very powerful tool designed for full history inspection and code search, and its commands are flexible enough to perform even complex tasks. This particular task is not complex for Git: the following example lists every commit by firstname.lastname@example.org on the current branch, with the lines added and removed per file:
git log --author=firstname.lastname@example.org --numstat
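If only the list of files is needed, a variation along these lines may help. It is a sketch in a temporary repository: the two authors, their addresses and the commit_as helper are hypothetical and exist only to give the log something to filter.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: commit_as <name> <email> <message>
commit_as() {
  git -c user.name="$1" -c user.email="$2" commit -q -m "$3"
}

echo a > a.txt
git add a.txt
commit_as "Alice" "alice@example.com" "Add a"

echo b > b.txt
git add b.txt
commit_as "Bob" "bob@example.com" "Add b"

echo a2 >> a.txt
echo c > c.txt
git add a.txt c.txt
commit_as "Alice" "alice@example.com" "Edit a, add c"

# Every file Alice touched on the current branch, each listed once:
# --name-only prints the paths, the empty format suppresses the commit headers
files=$(git log --author="alice@example.com" --name-only --pretty=format: | sort -u | sed '/^$/d')
echo "$files"
```

The --author value is matched as a pattern against the author name and e-mail, so a full address, a surname or a domain all work as filters.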
Q: Do you have any best practices to migrate from Subversion to GIT?
A: My best suggestion is to ensure consistency between Subversion and Git usernames and e-mails. You would not want to lose the association between who made a commit in SVN and their identity in Git. To enforce consistency, make sure that the “forge identity” permission is disabled when you start using Git after the initial migration, so that your ex-SVN users always stay aligned with their own identity in Git. “Forge identity” can then be granted gradually as developers become more familiar with Git.
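One way to check identity consistency before migrating is to compare the usernames found in the SVN history with an authors file in the format that git svn accepts. The sketch below assumes both lists already exist; every username, name and address in it is made up, and in practice the username list would be extracted from the SVN log.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Hypothetical mapping, in the format read by `git svn --authors-file`:
#   <svn username> = <Full Name> <e-mail>
cat > authors.txt <<'EOF'
jsmith = John Smith <john.smith@example.com>
mrossi = Maria Rossi <maria.rossi@example.com>
EOF

# Hypothetical list of usernames that appear in the SVN history
cat > svn-users.txt <<'EOF'
jsmith
mrossi
build
EOF

# Report every SVN username that has no Git identity yet
missing=$(while read -r user; do
  grep -q "^$user = " authors.txt || echo "$user"
done < svn-users.txt)
echo "Unmapped SVN users: $missing"

# Once the list is empty, the migration could be run along these lines:
#   git svn clone --stdlayout --authors-file=authors.txt <svn-url> <target-dir>
```

Here the check reports the "build" account, which is typical of service users that are easy to forget when the mapping is written by hand.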
Q: C1 to C5 is like Team Branch. C6, C7 is like Dev Branch.
A: C6-C7 is more like a “topic-branch” where one or more developers are working on a specific feature. Master is the main development branch.
Q: Why do we need a Contributor who is NOT a committer?
A: Anybody in the Team or outside the Team (when authorised to do so) should be able to provide their ideas and possibly even their fixes to the code; however, you do not necessarily want to allow everybody to commit and merge a (potentially flawed) change and break the Team master build, delaying your project sprint. By defining “contributors” you allow everybody to bring their changes into the Team’s discussion and promote them through automatic validation (Jenkins CI builds) and collective code review, avoiding the risk of the build being broken by unwanted or unvalidated changes. You can then differentiate them from the “Senior Team Members” (Committers), who have the technical ownership of the design and timelines of the Team Project.
Q: any comment on gerrit vs sonar code reviews?
A: Sonar is a fantastic infrastructure for triggering code quality checks, but not necessarily for acting on code promotion. Gerrit is the place where people decide what to do with a change, whilst Sonar only provides feedback on “what the change looks like” without any active review action on it. However, it would be possible to integrate Sonar feedback into the Gerrit code-review lifecycle, either by developing a Sonar plugin for Gerrit or simply by getting Sonar feedback into the Jenkins build and then providing a validation result to Gerrit through the Jenkins Gerrit plugin.
Q: In the Git merge slide, what’s the recursive option?
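A: The recursive merge strategy (the default merge option in Git) performs a three-way merge of the two branches starting from their common ancestor, the branch point. When the branch history is complex and there is more than one candidate ancestor, Git walks the history recursively and merges those ancestors into a single virtual one, which it then uses as the base of the merge.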
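A small sketch of the simple case, one common ancestor, in a temporary repository (file names, commit labels and the demo identity are assumptions). It asks Git for the branch point with “git merge-base” and then requests the recursive strategy explicitly; recent Git releases may route that name to a newer implementation of the same idea.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "C1: common ancestor"
fork=$(git rev-parse HEAD)

git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "C2: work on topic"

git checkout -q master
echo main > main.txt
git add main.txt
git commit -q -m "C3: work on master"

# The branch point that the three-way merge starts from
base=$(git merge-base master topic)

# Merge with the strategy named explicitly
git merge -q -s recursive -m "Merge topic into master" topic

# A merge commit records both branch tips as its parents
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "merge base: $base"
echo "parents of the merge commit: $parents"
```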
Q: does Gerrit only works with Git?
A: Yes, mainly because it needs all the power and ability of Git to merge and rebase changes once they have been reviewed, approved and then submitted. Additionally, Gerrit uses Git’s ability to define custom “hidden branch namespaces” to store the security and role definitions of the repository itself.
Q: will Gerrit work with Git Fusion? Do we need a mirror repository in that case?
A: Perforce Git Fusion, used as the “blessed Git repository”, is a Perforce instance that “talks” the Git protocol to allow cloning from and pushing to it. Gerrit, however, uses JGit to access its Git repositories on the filesystem and therefore cannot use a Perforce backend. Perforce Git Fusion could still be used as a synchronisation peer of a Gerrit repository, through Git replication.
Q: how does the distributed scm work in git when the development teams are spread across geographical locations?
A: You typically configure one replicated Gerrit instance per geographical site, in order to get maximum performance on the “git pull” operation. However, to minimise conflicts during geographical repository replication, it is important to “git push” to only one of the repository replicas; all the others will eventually receive the changes as the Gerrit replication events are propagated.
Q: how does git auto resolve work, how does git know which files to merge first in order to keep all changes from all submits?
A: Git has a lot of powerful commands and hidden (or semi-hidden) functionality, waiting to be discovered and leveraged!
I think you are referring to Git’s automatic conflict resolution (aka git rerere), described in the Git documentation. Git stores the “history” of how a conflict was resolved in the past and then reuses that information to resolve the same conflict in the future. The functionality is disabled by default but can be turned on with the following command:
git config --global rerere.enabled true
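To see the effect, the following sketch resolves a conflict by hand once, undoes the merge, and repeats it. It runs in a temporary repository with made-up file contents and a demo identity, and it enables rerere for that repository only rather than with --global.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
git config rerere.enabled true   # local to this repository

echo "original" > greeting.txt
git add greeting.txt
git commit -q -m "Base"

git checkout -q -b topic
echo "from topic" > greeting.txt
git commit -q -am "Topic change"

git checkout -q master
echo "from master" > greeting.txt
git commit -q -am "Master change"

# First merge: it conflicts, we resolve it by hand, and rerere records the resolution
git merge topic >/dev/null 2>&1 || true
echo "from master and topic" > greeting.txt
git commit -q -am "Merge topic (resolved by hand)"

# Undo the merge and run it again: the same conflict comes back
git reset -q --hard HEAD~1
git merge topic >/dev/null 2>&1 || true

# rerere has already rewritten the working-tree file with the recorded resolution
resolved=$(cat greeting.txt)
echo "$resolved"

# The merge still waits for us to confirm the result
git commit -q -am "Merge topic (resolution reused)"
```

By default the reused resolution is applied to the working tree only, which leaves room to inspect it before committing.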
Q: Can you please explain what’s the diff between Recursive Merge vs Rebase? the diagrams look very similar
A: There is one fundamental difference between “merging” and “rebasing” two branches. A merge preserves the full history of the two branches and creates one extra commit containing the “merged code”. A rebase modifies the history of the branch by “replaying the changes” on top of the target branch, as if you were in a time machine, pushing your changes into a future state where the branch point was never created. In a nutshell, a “merge” can be reverted because it does not change the history (removing the merge commit would suffice), whilst a “rebase” cannot be reverted easily because the history of the “rebased branch” has been modified and reapplied in the future. Rebase does, however, have the effect of “flattening” the branch history, giving a more concise and readable set of commits on a development branch.
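The difference shows up clearly when the same diverged history is integrated both ways. The sketch below builds two identical temporary repositories through a hypothetical make_repo helper (commit labels, file names and the demo identity are assumptions) and counts the merge commits left on master in each case.

```shell
#!/bin/sh
set -e

# Hypothetical helper: creates a fresh repository where master and topic have diverged
make_repo() {
  dir=$(mktemp -d)
  cd "$dir"
  git init -q
  git symbolic-ref HEAD refs/heads/master
  git config user.name "Demo User"
  git config user.email "demo@example.com"
  echo base > base.txt
  git add base.txt
  git commit -q -m "C1: branch point"
  git checkout -q -b topic
  echo topic > topic.txt
  git add topic.txt
  git commit -q -m "C2: work on topic"
  git checkout -q master
  echo main > main.txt
  git add main.txt
  git commit -q -m "C3: work on master"
}

# Option 1: merge. Both histories are kept and one extra merge commit is created.
make_repo
git merge -q -m "Merge topic into master" topic
merges_after_merge=$(git rev-list --merges --count master)
commits_after_merge=$(git rev-list --count master)

# Option 2: rebase. The topic commit is replayed on top of master and gets a new
# identity, after which master can simply fast-forward to it.
make_repo
topic_before=$(git rev-parse topic)
git rebase -q master topic
topic_after=$(git rev-parse topic)
git checkout -q master
git merge -q --ff-only topic
merges_after_rebase=$(git rev-list --merges --count master)
commits_after_rebase=$(git rev-list --count master)

echo "merge:  $commits_after_merge commits, $merges_after_merge merge commit(s)"
echo "rebase: $commits_after_rebase commits, $merges_after_rebase merge commit(s)"
```

The changed hash of the topic commit is the rewritten history the answer refers to: anyone who had already fetched the old topic branch now holds a commit that no longer exists on it.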