Re: [PATCH v4 00/13] New remote-hg helper

On Mon, Nov 5, 2012 at 3:13 PM, Michael J Gruber
<git@xxxxxxxxxxxxxxxxxxxx> wrote:
> Felipe Contreras venit, vidit, dixit 02.11.2012 19:01:

>> I talked with some people in #mercurial, and apparently there is a
>> concept of a 'changelog' that is supposed to store these changes, but
>> since the format has changed, the content of it is unreliable. That's
>> not a big problem because it's used mostly for reporting purposes
>> (log, query), not for doing anything reliable.
>
> Is the changelog stored in the repo (i.e. generated by the hg version at
> commit time) or generated on the fly (i.e. generated by the hg version
> at hand)? See also below.

I don't know. I would expect the former, with the changelog regenerated
by whatever tool does the conversion when the format changes.

>> To reliably see the changes, one has to compare the 'manifest' of the
>> revisions involved, which contain *all* the files in them.
>
> 'manifest' == '(exploded) tree', right? Just making sure my hg fu is not
> subzero.

Yeah, the tree. As I said, it contains all the files.
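
As a rough sketch of what comparing two manifests amounts to: plain
dictionaries stand in for the manifests here, and `Manifest` and
`manifest_changes` are made-up names for illustration, not Mercurial's
API.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a manifest: path -> file node id (hex string).
Manifest = Dict[str, str]


def manifest_changes(old: Manifest, new: Manifest) -> Tuple[List[str], List[str], List[str]]:
    """Return (modified, added, removed) paths between two full manifests."""
    # A path is "modified" when it exists in both but its node id differs.
    modified = sorted(p for p in new if p in old and old[p] != new[p])
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    return modified, added, removed


old = {"a": "1111", "b": "2222", "c": "3333"}
new = {"a": "1111", "b": "4444", "d": "5555"}
print(manifest_changes(old, new))  # (['b'], ['d'], ['c'])
```

Since every manifest lists all the files, the comparison needs nothing
but the two manifests themselves.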

>> That's what I was doing already, but I found a more efficient way to
>> do it. msysGit is using the changelog, which is quite fast, but not
>> reliable.
>>
>> Unfortunately, while going through mercurial's code, I found an issue,
>> and it turns out that 1) is not correct.
>>
>> In mercurial, a file hash also covers the parent file nodes, which
>> means that even if two files have the same content, they would not
>> have the same hash, so there's no point in keeping track of them to
>> avoid extracting the data unnecessarily, because in order to make sure
>> they are different, you need to extract the data anyway, defeating the
>> purpose.
>
> Do I understand correctly that neither the msysgit version nor yours can
> detect duplicate blobs (without requesting them) because of that sha1 issue?

That's correct.
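
To make the difference concrete, here is a minimal sketch. It assumes
the revlog node id is the SHA-1 of the two parent ids (sorted) followed
by the text, and that a git blob id is the SHA-1 of a "blob <size>\0"
header plus the content. `hg_file_node` and `git_blob_id` are names I
made up for this example, and copy metadata is ignored.

```python
import hashlib

NULL_ID = b"\0" * 20  # null node id, used when a parent is absent


def hg_file_node(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """Revlog-style node id: SHA-1 over the sorted parent ids, then the text."""
    a, b = sorted((p1, p2))
    return hashlib.sha1(a + b + text).digest()


def git_blob_id(text: bytes) -> bytes:
    """Git blob id: SHA-1 over a 'blob <size>' header, a NUL byte, and the content."""
    return hashlib.sha1(b"blob %d\0" % len(text) + text).digest()


# Same history as the hg session: add 'a', change it, then revert it.
rev0 = hg_file_node(b"a\n")
rev1 = hg_file_node(b"a\na\n", p1=rev0)
rev2 = hg_file_node(b"a\n", p1=rev1)

print(rev0 == rev2)                                # False: parents differ
print(git_blob_id(b"a\n") == git_blob_id(b"a\n"))  # True: content only
```

Under those assumptions, identical content at revisions 0 and 2 still
yields two different file nodes, so the node id alone cannot tell us
that the data is a duplicate.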

> I'm really wondering why a file blob hash carries its history along in
> the sha1. This appears completely strange to gitters (being brainwashed
> about "content tracking"), but may be due to hg's extensive use of
> delta, or really: delta chains (which do have their merit on the server
> side).

It is a surprise to me too. I see absolutely no reason why that would be useful.

It seems like bazaar does store the file hashes without the parent
info, like git.

>> Which means mercurial doesn't really behave as one would expect:
>>
>> # add files with the same content
>>
>>   $ echo a > a
>>   $ hg ci -Am adda
>>   adding a
>>   $ echo a >> a
>>   $ hg ci -m changea
>>   $ echo a > a
>>   $ hg st --rev 0
>>   $ hg ci -m reverta
>>   $ hg log -G --template '{rev} {desc}\n'
>>   @  2 reverta
>>   |
>>   o  1 changea
>>   |
>>   o  0 adda
>>
>> # check the difference between the first and the last revision
>>
>>   $ hg st --rev 0:2
>>   M a
>>   $ hg cat -r 0 a
>>   a
>>   $ hg cat -r 2 a
>>   a
>
> That is really scary. What use is "hg stat --rev" then? Not blaming you
> for hg, of course.
>
> On that tangent, I just noticed recently that hg has no python api.
> Seriously [1]. They even tell us not to use the internal python api.
> msysgit has been lacking support for newer hg, and you've had to add
> support for older versions (hg 1.9 will be around on quite some
> stable/LTS/EL distro releases) after developing on newer/current ones.
> I'm wondering how well that scales in the long term (telling from
> git-svn experience: it does not scale well), or whether using some
> stable api like 'hgapi' would be a huge bottleneck.

I don't know. I have never really used mercurial until recently. I
don't know how often they change their APIs and/or repository formats.
I would say the burden of updating to newer APIs is probably much less
than the burden of implementing code that accesses their repositories
directly, and eventually possibly rewriting the code when they change
the format.

If we were to access the repository directly, I would choose to use
Ruby for that, but given that 'we' is increasingly looking like 'I', I
probably wouldn't.

Cheers.

-- 
Felipe Contreras