Re: mercurial to git

Hopefully you won't mind that I'm adding the git list back to the cc
line, since it would be useful for others to provide some feedback.

On Thu, Mar 15, 2007 at 09:44:35AM +0000, Rocco Rutte wrote:
> >So I'll go try it out in the near future.  Are you planning on being
> >able to make it be bi-directional?  (i.e., so that changes in the git
> >tree can get propagated back to the hg tree?)
> 

> But as there's no hg-fast-import, I think git to hg not so trivial to 
> implement and convert-repo already exists, so I'd rather prefer 
> extending it to do the job.

Actually, there *is* an hg-fast-import.  It exists in the hg sources
in contrib/convert-repo, and it is being used in production to do
incremental conversion from the Linux kernel git tree to an hg tree.
So it does handle octopus merges already (it has to, the ACPI folks
are very octopus-merge happy :-).

> However, I never even used hg and have only some knowledge about the API 
> so that I see some difficulties and need more time to think about it 
> (e.g. how to detect whether a change in hg originates at git and vice 
> versa, what to do with octopus merges, cherry-picks, etc).

I have actually thought about this a fair amount, so I hope you don't
mind my pontificating a bit.   :-)

At the highest architectural viewpoint, there are three levels of
difficulty of SCM conversions:

A) One-way conversion utilities.  Examples of this would be the
	hg2git and hg-fast-export scripts that convert from hg to git,
	and the convert-repo script which will convert from git to hg.

B) Single-point bidirectional conversion.  At this level, the hg/git
	gateway will run on a single machine, and with a state file,
	can recognize new git changesets, create a functionally
	equivalent hg changeset for each and commit it to the hg
	repository, and can also recognize new hg changesets, create a
	functionally equivalent git changeset for each, and commit it
	to the git repository.

C) Multisite bidirectional conversion.  At this level, multiple users
	can be gatewaying between the two DSCM systems, and as long as
	they are using the same configuration parameters (more on this
	in a moment), if user A converts a changeset from hg to git, and
	that changeset is passed along via git to user B, who then,
	running the bidirectional gateway program, converts it back from
	git to hg, the hg changeset is identical, so that hg recognizes
	it as the same changeset when it gets propagated back to user A.

(C) would be the ideal, given the distributed nature of hg and git.
It is also the most difficult, because it means that we need to be
able to do a lossless, one-to-one conversion of a Changeset.  It is
also somewhat at odds with doing author mapping and signed-off-by
parsing, since that could prevent a reversible transformation.
However, what may very well be common for projects is for them to
start with (B), and to convert over some of the historical changesets,
and then later on allow multiple users to clone from the two git/hg
repositories and then do the multisite conversion.
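
To make the reversibility point concrete, here is a minimal sketch in
Python.  The author map, the user names and the helper functions are
all made up for illustration; they are not taken from any existing
converter.  It just shows that once two hg users are folded into one
git author, the reverse direction has no way of knowing which one to
restore.

```python
# Hypothetical author map of the kind a one-way converter might apply.
AUTHOR_MAP = {
    "alice": "Alice Example <alice@example.org>",
    "alice@laptop": "Alice Example <alice@example.org>",
    "bob": "Bob Example <bob@example.org>",
}


def map_author(hg_user):
    """Map an hg user string to a git-style author; unknown users pass through."""
    return AUTHOR_MAP.get(hg_user, hg_user)


def invert(mapping):
    """Return (inverse, ambiguous).

    inverse holds the entries that can be reversed unambiguously;
    ambiguous is the set of targets that more than one source maps to.
    """
    sources = {}
    for src, dst in mapping.items():
        sources.setdefault(dst, []).append(src)
    inverse = {dst: srcs[0] for dst, srcs in sources.items() if len(srcs) == 1}
    ambiguous = {dst for dst, srcs in sources.items() if len(srcs) > 1}
    return inverse, ambiguous


inverse, ambiguous = invert(AUTHOR_MAP)
```

Any author that lands in "ambiguous" cannot be converted back to the
original hg user, so a gateway aiming for (C) would have to either
refuse such a map or record the original string somewhere.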

So what that also means is that even if we only do (B) at first, it
might be useful if we have some of the characteristics needed to
eventually get to (C), even if we can't get there right away.

So more practically, here are some of the things that we would need to
do, looking at hg-fast-export:

*) Change the index/marks file to be keyed by hg SHA-1 hash IDs instead
of the small integer ordinals.  This is useful for enabling multisite
conversion, but it is also useful for tracking tag changes in .hgtags.

*) Have a mode that, instead of only checking changesets newer than the
last run, simply iterates over all changesets in mercurial and checks
whether the hg SHA-1 commit ID is already in the marks file; if so, it
skips that changeset.

*) Have a mode where the COMMITTER id is "hg2git" and the COMMITTER_DATE
is the same as the AUTHOR_DATE (so that the changelog conversion is
the same no matter where it is done or who does it).  This is mainly
to enable multisite conversion.
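
The three items above can be sketched roughly as follows.  This is
plain Python with no hg or git calls, and everything in it is an
assumption made for illustration: the marks file layout (one
"<hg-sha1> <git-sha1>" pair per line), the function names, and the
dictionary standing in for a commit header.  It is not how
hg-fast-export stores its state today.

```python
def load_marks(lines):
    """Parse marks lines of the assumed form '<hg-sha1> <git-sha1>'."""
    marks = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        hg_id, git_id = line.split()
        marks[hg_id] = git_id
    return marks


def pending_changesets(all_hg_ids, marks):
    """Walk every hg changeset ID and keep only those not yet in the marks."""
    return [hg_id for hg_id in all_hg_ids if hg_id not in marks]


def commit_identity(author, author_date, deterministic=True,
                    local_committer=None, local_date=None):
    """Pick the committer and committer date for a converted changeset."""
    if deterministic:
        # Fixed committer, and committer date copied from the author date,
        # so the result does not depend on who converts or when.
        committer, committer_date = "hg2git", author_date
    else:
        committer, committer_date = local_committer, local_date
    return {
        "author": author,
        "author_date": author_date,
        "committer": committer,
        "committer_date": committer_date,
    }
```

Because the marks are keyed by the hg hash rather than by a local
ordinal, two users who exchange marks files (or rebuild them from
scratch) agree on which changesets have already been converted, and
the deterministic mode gives both of them the same commit header for
the same hg changeset.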

						- Ted