Much attention has been given to the importance of leveraging one's technology base to achieve short product cycles through concurrent development. However, practical advice in this matter is hard to come by, and in my experience few organizations have been able to accomplish it successfully. I believe that the failure of most such development efforts is rooted in underestimating the subtle complexity of the problem, and accordingly addressing it with oversimplified methodologies that are not equipped to deal with the issues that arise.
On the bright side, though, the problem is not so complex as to be hopeless. In fact, there are existing methodologies and technologies that can achieve efficient and successful concurrent development when used judiciously and appropriately.
The problem of dealing with multiple concurrent contributors to a single project with well-defined goals and a single release event is actually a relatively simple one, and yet many organizations still get it wrong.
These problems are alleviated by using a locking methodology, such as RCS. Still, the possibility of deadlock and the problem of checking in incompatible changes remain. Furthermore, because files are still updated asynchronously, it is essentially impossible to achieve a self-consistent state without temporarily forbidding concurrent changes. Thus, incremental progress cannot be verified without significantly slowing the rate of development. For these reasons, this approach is still unsuitable for a project with many contributors and nontrivial interfaces.
Merging may require manual intervention, but when changes are independent merging is almost always automatic. Planning the work to be done such that most interdependent changes are not concurrent is recommended. (Interdependent changes are not to be confused with distinct components of a single change that may be submitted by multiple contributors, which should normally occur concurrently.) It is also recommended that each contributor have a separate tree for every unrelated change, such that each change can be submitted separately when it reaches maturity.
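To make the distinction between independent and interdependent changes concrete, here is a minimal sketch of a three-way merge at whole-file granularity. It is an illustration only: real source control systems merge within files as well, and the function name, the dict-based tree representation, and the file names are all assumptions made for this example.

```python
def three_way_merge(base, mine, theirs):
    """Merge two trees derived from a common base, file by file.

    Each tree is a dict mapping a file name to its contents; a missing
    key means the file does not exist in that tree.
    Returns (merged_tree, conflicts). Conflicting files are left out of
    merged_tree and listed in conflicts for manual resolution.
    """
    merged = {}
    conflicts = []
    for name in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(name), mine.get(name), theirs.get(name)
        if m == t:
            result = m              # both sides agree, or neither changed
        elif m == b:
            result = t              # only their side changed
        elif t == b:
            result = m              # only my side changed
        else:
            conflicts.append(name)  # both changed differently: needs a human
            continue
        if result is not None:      # None means the file was deleted
            merged[name] = result
    return merged, conflicts


# Hypothetical trees: two contributors touch different files.
base = {"alu.v": "rev1", "bus.v": "rev1"}
mine = {"alu.v": "rev2", "bus.v": "rev1"}
theirs = {"alu.v": "rev1", "bus.v": "rev3"}
merged, conflicts = three_way_merge(base, mine, theirs)
```

In this example the two changes are independent, so the merge completes with no conflicts. Had both contributors edited alu.v differently, that file would be reported as a conflict instead.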
The fact that checking in before others means having to do less merging provides an incentive to complete changes rapidly, which tends to boost productivity. However, it can also lead to low-quality submissions unless there are agreed-upon submission criteria.
Having a uniform minimum check-in regression for all submitters is a good idea, because it minimizes the probability that incorrect changes are ever checked in. (Being forced to incorporate somebody else's bugs into your changes is generally counterproductive.) The regression itself should be under source control, as it normally becomes more thorough as release approaches. Triggering the regression automatically may be advisable if contributors lack the discipline to run it voluntarily.
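A check-in gate of this kind might look roughly like the following sketch. It assumes a tree modeled as a dict of file contents and a regression modeled as a list of predicate functions; the names check_in, builds, and has_no_todo are hypothetical and stand in for whatever the project's real regression does.

```python
def check_in(tree, change, regression):
    """Apply `change` to a copy of `tree`; accept it only if every test passes.

    Returns (resulting_tree, failures). On failure the original tree is
    returned untouched, so a bad change never reaches other contributors.
    """
    candidate = dict(tree)
    candidate.update(change)
    failures = [test.__name__ for test in regression if not test(candidate)]
    if failures:
        return tree, failures
    return candidate, []


# Hypothetical minimum regression shared by all submitters.
def builds(tree):
    return "main.c" in tree


def has_no_todo(tree):
    return all("TODO" not in text for text in tree.values())


regression = [builds, has_no_todo]
tree = {"main.c": "int main(void) { return 0; }"}

rejected_tree, rejected = check_in(tree, {"util.c": "/* TODO */"}, regression)
accepted_tree, accepted = check_in(tree, {"util.c": "int f(void);"}, regression)
```

Running the same gate for every submitter, rather than trusting each contributor's private checks, is what makes the minimum uniform.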
While static isolation solves many problems associated with coordinating development, it should not supplant out-of-band communication (i.e. talking and documenting). In particular, conflicting assumptions about project goals or conventions are still guaranteed to cause trouble.
Having multiple concurrent projects with different release dates and potentially conflicting goals poses a much more difficult challenge. Almost nobody gets this right on the first try. Static isolation among contributors is still recommended, but it is no longer sufficient to avoid problems.
Rather than maintaining previous releases as constant entities (which is supported by the source control system anyway), the shared portions of the tree are allowed to evolve over time, provided that a configuration satisfying the goals of each previous release is maintained. This permits global improvement without violating local requirements. (Obsoleting previous releases outright is to be avoided, because they tend to remain relevant in the marketplace long after developers have lost interest in them.)
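As a small illustration of what "maintaining a configuration" for each release can mean, the sketch below keeps one shared routine and selects release-specific behavior through a configuration table. The release names, the checksum option, and the framing format are all invented for this example.

```python
# Hypothetical per-release configurations of a single shared code base.
CONFIGS = {
    "release_1": {"checksum": False},  # goals of the earlier release
    "release_2": {"checksum": True},   # the later release adds a checksum
}


def encode(payload, config):
    """Shared code: frame a payload as a length byte followed by the data.

    The checksum byte was added for release_2, but it is controlled by the
    configuration so that release_1 output is unchanged.
    """
    frame = bytes([len(payload)]) + payload
    if config["checksum"]:
        frame += bytes([sum(payload) % 256])
    return frame


old_frame = encode(b"\x01\x02", CONFIGS["release_1"])
new_frame = encode(b"\x01\x02", CONFIGS["release_2"])
```

The shared routine has evolved, yet the release_1 configuration still produces exactly what that release requires.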
One clear advantage of a unified code base is that if maintaining unification turns out to be unmanageable for some reason, it is still easy to fork the code base at any time. On the other hand, once you take the copy-and-run approach, it rapidly becomes prohibitively difficult to remerge the code base.
While the unified code base approach has been observed to succeed, it still has a number of issues:
Every project lives on its own branch off the trunk. Every branch has its own regression, which normally covers only the configuration of the current project. The trunk regression includes tests for every project. Changes propagate from a project branch into the trunk, and subsequently back down into each of the other project branches. Propagated changes that cause any regression to fail are flagged as issues to be resolved.
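The flow of a change through this branch structure can be sketched as follows. This is a simplified model under stated assumptions: merging is reduced to a dict update, each regression is a list of predicates, and the Branch class, the propagate function, and the project names alpha and beta are hypothetical.

```python
class Branch:
    def __init__(self, name, regression):
        self.name = name
        self.regression = regression  # list of predicates over the tree
        self.tree = {}

    def failing(self):
        """Names of the regression tests that the current tree fails."""
        return [t.__name__ for t in self.regression if not t(self.tree)]


def propagate(change, source, trunk, projects):
    """Push `change` from `source` into the trunk, then down into every
    other project branch. Returns a list of (branch name, failed tests)
    issues to be resolved."""
    issues = []
    targets = [trunk] + [p for p in projects if p is not source]
    for branch in targets:
        branch.tree.update(change)  # stand-in for a real merge
        failed = branch.failing()
        if failed:
            issues.append((branch.name, failed))
    return issues


# Each project regression covers only that project's configuration.
def alpha_ok(tree):
    return tree.get("bus_width") in (32, 64)


def beta_ok(tree):
    return tree.get("bus_width") == 32


alpha = Branch("alpha", [alpha_ok])
beta = Branch("beta", [beta_ok])
trunk = Branch("trunk", [alpha_ok, beta_ok])  # tests for every project

change = {"bus_width": 64}
alpha.tree.update(change)  # passes alpha's own regression
issues = propagate(change, alpha, trunk, [alpha, beta])
```

Here the change passes on its own project branch but is flagged at the trunk and again on the beta branch, which is exactly the kind of cross-project impact the partitioned regressions are meant to surface.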
Contributors must still remain mindful of the impact of changes on all projects, but partitioning the regressions minimizes the risk of undetected errors without imposing enormous check-in latencies. To further minimize this risk, an intermediate branch between the trunk and the project branch with a very thorough regression may be added. However, this increases the latency of integration.
When project branches are first instituted, the task of change propagation and merging is likely to be neglected. However, provided that it is impressed upon contributors that merging is not optional, the fact that the first to propagate changes has the advantage of easier merging will provide ample incentive to propagate early and often.
When multiple contributors perform mutually required components of a given change, a development branch is created off the project branch. This branch is used as a place-holder to integrate those components until the result is expected to pass the project regression. Development branches typically lack a regression of their own.
A development branch can also be used to track the progress of a nontrivial change performed by a single contributor.
When release is too imminent to justify the risk of incorporating changes from another project, but changes to the current project are ongoing, a release branch is used. Delaying the propagation of changes is a bad idea, because it adds risk to the contributing project. However, the release branch can isolate those changes from the impending release of the recipient project. The subset of the changes in the release branch that are applicable to the unified code base must be manually propagated to the project branch.
Release branches should be created only on the basis of risk. Changes that are rejected on the basis of suitability should instead be altered such that the recipient project is unaffected.
If the "right" way and the "safe" way to address a defect conflict, then the safe way belongs in the release branch, and the right way belongs in the trunk.
The most common pitfalls with branching are failing to account for what has already been merged, and making changes to a project branch that are obviously unsuitable for the unified code base. Such mistakes tend to preclude merging, which defeats the advantages of a unified code base, so don't do that. If you're not up to the challenges of branching, then you're not up to the challenges of concurrent project development.
Anders Johnson, last modified $Date: 2002/02/05 $