Re: [PATCH] Add new git-remote-hd helper

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 18, 2012 at 3:18 PM, Michael J Gruber
<git@xxxxxxxxxxxxxxxxxxxx> wrote:
> Felipe Contreras venit, vidit, dixit 17.10.2012 14:58:
>> Signed-off-by: Felipe Contreras <felipe.contreras@xxxxxxxxx>
>> ---
>>
>> I've looked at many hg<->git tools and none satisfy me. Too complicated, or too
>> slow, or to difficult to setup, etc.
>
> It's in an unsatisfying state, I agree. We have a great remote helper
> infrastructure, but neither git-svn nor git-cvsimport/export use it, and
> remote-hg is not in git.git.
>
> git-svn used to be our killer feature! (It's becoming hard to maintain,
> though.)
>
> git-hg (in the shape of a remote helper) could be our next killer
> feature, finally leading to our world domination ;)
>
>> The only one I've liked so far is hg-fast-export[1], which is indeed fast,
>> relatively simple, and relatively easy to use. But it's not properly maintained
>> any more.
>>
>> So, I decided to write my own from scratch, using hg-fast-export as
>> inspiration, and voila.
>>
>> This one doesn't have any dependencies, just put it into your $PATH, and you
>> can clone and fetch hg repositories. More importantly to me; the code is
>> simple, and easy to maintain.
>
> Well, it still has hg as a dependency. All our remote
> integration/helpers suffer from that. At least, every hg install comes
> with the modules, whereas svn is a beast (apr and such) that often comes
> without the language bindings.

Yeah, but there's no way around that, even if we write some code to
access the repository, it's quite likely that it will change. Unlike
git, mercurial developers expect people to access the repository
through mercurial API, not directly, and that's probably what we
should do if we want to avoid problems.

>>  contrib/remote-hd/git-remote-hg | 231 ++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 231 insertions(+)
>>  create mode 100755 contrib/remote-hd/git-remote-hg
>
> That diffstat looks great (sans tests, of course; it's contrib), but
> there's no push functionality, and that is usually the most difficult
> part all helpers have to implement: Not only the push interaction, but
> also making sure that commits don't get duped in a roundtrip (git fetch
> from vcs, push to vcs, git fetch from vcs).

Yeah, it will probably require much more code, but I think by far the
most important feature is fetching.

> Just cloning and fetching can be done most easily with a shared worktree
> and scripting around "hg up" and "git commit -A" in some flavour.

Yea, but that will be dead slow.

>> diff --git a/contrib/remote-hd/git-remote-hg b/contrib/remote-hd/git-remote-hg
>> new file mode 100755
>> index 0000000..9157b30
>> --- /dev/null
>> +++ b/contrib/remote-hd/git-remote-hg
>> @@ -0,0 +1,231 @@
>> +#!/usr/bin/python2
>> +
>> +# Inspired by Rocco Rutte's hg-fast-export
>> +
>> +# Just copy to your ~/bin, or anywhere in your $PATH.
>> +# Then you can clone with:
>> +# hg::file:///path/to/mercurial/repo/
>> +
>> +from mercurial import hg, ui
>> +
>> +import re
>> +import sys
>> +import os
>> +import json
>> +
>> +def die(msg, *args):
>> +    print >> sys.stderr, 'ERROR:', msg % args
>> +    sys.exit(1)
>
> While we don't have to code for py3, avoiding '>>' will help us later.
> (It got removed in py3.) sys.sdterr.write() should be most portable.

All right.

>> +def gitmode(flags):
>> +    return 'l' in flags and '120000' or 'x' in flags and '100755' or '100644'
>> +
>> +def export_file(fc):
>> +    if fc.path() == '.hgtags':
>
> Is this always relative? Just wondering, dunno.

AFAIK, why wouldn't it?

>> +def get_repo(path, alias):
>> +    myui = ui.ui()
>> +    myui.setconfig('ui', 'interactive', 'off')
>> +    repo = hg.repository(myui, path)
>> +    return repo
>
> Is there a reason to spell this out, e.g.: Why not
>
> return hg.repository(myui, path)

No reason, but I already have patches on top of this.

>> +def hg_branch(b):
>> +    if b == 'master':
>> +        return 'default'
>> +    return b
>> +
>> +def git_branch(b):
>> +    if b == 'default':
>> +        return 'master'
>> +    return b
>> +
>> +def export_tag(repo, tag):
>> +    global prefix
>> +    print "reset %s/tags/%s" % (prefix, tag)
>> +    print "from :%s" % (repo[tag].rev() + 1)
>> +    print
>> +
>> +def export_branch(repo, branch):
>> +    global prefix, marks, cache, branches
>> +
>> +    heads = branches[hg_branch(branch)]
>> +
>> +    # verify there's only one head
>> +    if (len(heads) > 1):
>> +        die("Branch '%s' has more than one head" % hg_branch(branch))
>
> We have to deal with this at some point... Do you support "hg
> bookmarks"? They'd be an option, or we implement better detached head
> handling in git...

I already updated this, I converted it to a warning and just picked a
random head.

Adding support for bookmarks should be easy, it's just a matter of
deciding how to differentiate branches from bookmarks. Perhaps
'refs/heads/bookmarks/foo'.

>> +    revs = [rev for rev in revs if not cache.get(rev, False)]
>> +
>> +    for rev in revs:
>
> Those lines set up the list just so that you iterate over it later.
> Please don't do this (unless you know that revs is very small).
>
> for rev in revs:
>   if cache.get(rev, False):
>     continue
>
> is more performant. You can reduce this further to
>
> count=0
> for rev in repo.revs('%u:%u' % (tip, head)):
>   if cache.get(rev, False):
>     continue
>
> which is even more performant generally, and especially if repo.revs()
> returns an iterator rather than a list.
>
> [Yes, you could use lambda+filter, but let's not get religious. The
> above is simple and pythonic.]

And we wouldn't be able to calculate the progress: len(revs).

>> +
>> +        c = repo[rev]
>> +        (manifest, user, (time, tz), files, desc, extra) = repo.changelog.read(c.node())
>
> Same here, you introduce c just for the next line (unless I'm mistaken).

The second line is already too big.

>> +            print "from :%s" % (parents[0] + 1)
>
> What is this +1 offset that is appearing here and there?

Marks start from 1, no? If not, this can be removed.

>> +def do_list(repo, args):
>> +    global branches
>> +
>> +    head = repo.dirstate.branch()
>> +    for branch in repo.branchmap():
>> +        heads = repo.branchheads(branch)
>> +        if len(heads):
>> +            branches[branch] = heads
>
> I'm a bit confused here. repo.brancheads() is a list, no? Is this the
> single head case only? I'd expect [0] of that, but you seem to be
> getting branch names (strings).

No, heads is a list of heads, as nodes.

> Also, if len(heads) == 0 then branches[branch] is undefined or stale. No?

Yes, but we don't care about those branches; they are closed.

> Overall, this looks like plain scripting in python rather than anything
> oo'ish, but that's okay and probably dictated by the remote helper
> interface and/or the existing exporter.

If there was an advantage of creating new classes, I would be all for
it, but I don't see any.

> I'm all for an improvement in that area, but still feel we'd need a
> combined effort rather than yet another start.

Me too, but as far a sverre's remote-hg code goes, there isn't even a
branch to work from.

Cheers.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]