Re: Savannah has Mercurial!

Andrew Haley <aph@xxxxxxxxxx> · Fri, 06 Jun 2008 11:00:49 +0100

Roman Kennke wrote:
> Hi Andrew,
> 
>>>> What do people think to the idea of switching?  Maybe post 0.98?
>> Mercurial is a disaster as far as I can see.  It doesn't seem to be
>> possible to work locally and merge back into the trunk without having
>> to do a complex and error-prone three-way merge, and several times
>> I've got into a state that it was impossible to recover from, even
>> with the help of the best Mercurial experts we have at RH.  The only
>> way people work successfully is to merge from trunk, check in, and
>> then push their changes immediately to the master repo before someone
>> else does any updates to the same files.  If you don't get in fast
>> enough, merge time.  This approach doesn't scale at all.
> 
> I disagree. I think this impression stems from the attempt to map CVS
> development behaviour to Mercurial (or any other DVCS). I think the way
> that HG handles this is conceptually better than CVS. Think about it:
>
> Developer A and B both clone the repository at Changeset (CS) 1. Both
> make changes and then try to push them back. We end up with:
> 
>  /-2
> 1
>  \-3
> 
> This needs to be merged, no matter if the changes are disjoint or not.
> Merging disjoint changes is easily done automatically. Now you could
> argue that CVS does this automatically when committing. I argue that
> this is not a good idea, because even disjoint changes might lead to a
> broken tree (although it doesn't happen that often in a well-structured
> project like Classpath. But I've run into it several times). Even worse,
> with CVS you basically lose some in-between information (one of the CSs
> automatically gets merged, while with HG you retain all the changesets
> and get one additional merge CS).
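
To make the history described above concrete, here is a minimal sketch in
Python. It is a toy model, not Mercurial's actual data structures: the
`history` dictionary and the `heads` helper are names made up for this
illustration. It shows that two changesets based on CS 1 leave two heads,
and that a merge changeset with two parents joins them while keeping both
originals in the history.

```python
def heads(history):
    """Return the revisions that no other revision names as a parent.

    history maps a revision number to a tuple of its parent revisions
    (a hypothetical representation chosen for this sketch).
    """
    all_parents = {p for parents in history.values() for p in parents}
    return sorted(rev for rev in history if rev not in all_parents)


# Both developers cloned at changeset 1 and committed on top of it.
history = {1: ()}
history[2] = (1,)  # developer A's changeset
history[3] = (1,)  # developer B's changeset
heads_before_merge = heads(history)  # two heads: 2 and 3

# The merge is recorded as an extra changeset with two parents;
# changesets 2 and 3 stay in the history unchanged.
history[4] = (2, 3)
heads_after_merge = heads(history)

print(heads_before_merge, heads_after_merge)
```

In this model nothing is overwritten by the merge: changesets 2 and 3 remain
retrievable, which is the "in-between information" the paragraph refers to.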

I'm not going to try to defend the worst points of CVS, such as its non-
atomic commits; a commit should either commit all files or fail.  I use
svn, and it does the right thing, more or less all the time.

The problem with Mercurial is that its main advantage, that of being
distributed, is much less of a big deal than the basic day-to-day business
of commits, merges, branches and so on, almost all of which AFAICS it
does worse than svn.  Even when I was new to CVS (and svn) I never managed
to get into a state where I had to abandon my changes and start again.

> The problem you describe occurs when too many people work on one
> 'trunk'. Then it is possible that you run into a kind of race. E.g. you
> try to push and get aborted because somebody else pushed in between. You
> do fetch (which is a great help for such situations instead of pull &
> update & merge & commit), and try to push again and fail again, because
> again, somebody has pushed stuff. I agree that this kind of development
> doesn't scale beyond a handful of developers.
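
The race described above can be sketched as a small simulation. This is a
toy model under stated assumptions, not Mercurial's real behaviour:
`MasterRepo` and `push_with_retry` are hypothetical names, the repository is
reduced to a single tip counter, and "fetch" is reduced to reading that
counter.

```python
class MasterRepo:
    """Toy stand-in for a shared repository; it only tracks a tip counter."""

    def __init__(self):
        self.tip = 0

    def push(self, seen_tip):
        """Accept a push only if the pusher has already merged the current tip."""
        if seen_tip != self.tip:
            return False  # somebody else pushed first: fetch again
        self.tip += 1
        return True


def push_with_retry(master, others_pushing_first):
    """Fetch, then push; repeat for as long as somebody else gets in first.

    others_pushing_first holds one boolean per attempt, saying whether
    another developer pushes between our fetch and our push.
    Returns the number of attempts that were needed.
    """
    attempts = 0
    for other_wins in others_pushing_first:
        attempts += 1
        seen_tip = master.tip  # stands in for pull, update, merge, commit
        if other_wins:
            master.push(master.tip)  # another developer pushes in between
        if master.push(seen_tip):
            return attempts
    raise RuntimeError("gave up: the trunk never stood still")


master = MasterRepo()
attempts_needed = push_with_retry(master, [True, True, False])
print(attempts_needed, master.tip)
```

The more developers share one trunk, the more often an entry in
`others_pushing_first` would be true, and the more fetch-and-push rounds each
developer has to go through.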

> I see two practical solutions to that. A project with so many developers
> should be structured on the level of the VCS. AFAICS, there are two
> reasonable approaches to that:
> 
> 1. the almost-anarchistic Linux kernel model: Every core developer (or
> maintainer) basically maintains his own public tree and pulls all the
> changes he cares about from somewhere else (mailing lists, other repos,
> etc). The release is rolled from one of those trees (Linus' tree in the
> case of the kernel).
> 2. A hierarchical model like Sun does for OpenJDK. Developers never push
> to the master repository, but instead work in their group repository.
> This decreases the size of each group to a reasonable level, thus
> avoiding the above problem to a great degree. Pushing then means to do
> hg fetch && hg push. A maintainer guy then pulls all the changes into
> the master tree on a regular basis, at which point only disjoint CSs are
> merged, and he can be reasonably certain that the thing ends up in a
> consistent way (because the groups have reasonably stable interfaces
> against each other). As a bonus, the maintainer guy should run a test
> before pushing the stuff to the master repo.

This is more or less what we do with gcc, except that we don't need to have
multiple repos because svn has zero-overhead branches.  Any developer
can create a branch any time they like, and work on that branch as though
it were their own repo, all changes are tracked, and everything is safely
backed up.  Also, everyone gets to see and share the work on the branch.

Andrew.

> 
> 
>>> I wouldn't object (although I don't really have trouble with CVS at this
>>> point with classpath). So if enough developers think it is a positive
>>> switch let's do it. We would be the second project on savannah though, so
>>> expect some first adopter issues.
>>>
>>> The only thing we have to really look out for is doing a good
>>> conversion, some experimentation with hg convert and/or tailor might be
>>> necessary.
> 
> I think both tailor and hg convert are pretty good and yield reasonable
> results for large-scale conversions. I think tailor is a little more
> stable. I did the conversion of Jamaica using tailor and had no problems so
> far. It makes sense to use username mapping, if you want full usernames
> instead of the CVS shorts.
> 
>>> Also a better understanding (best practices for) release branches would
>>> be nice. I found the in-tree branching of mercurial somewhat confusing
>>> at times, so it would be good to make sure we have clear guidelines for
>>> those who want to do (release) branches on the tree.
> 
> I prefer separate trees for branches. In-tree branches are only
> confusing and don't make so much sense. Same for tags. In my experience,
> trying to map CVS behaviour in this respect is only confusing and
> creates the impression that HG is in some way not mature. I had a lot of
> trouble in the autobuild infrastructure of Jamaica, which sets and
> deletes tags, and found that using clones for that is the only
> reasonable solution, because a shell script cannot reliably merge things
> (like tags and branches).
> 
> Cheers, Roman
>