Re: Git commit generation numbers

On 07/20/2011 08:18 PM, david@xxxxxxx wrote:
On Wed, 20 Jul 2011, Phil Hord wrote:

On 07/20/2011 07:36 PM, Nicolas Pitre wrote:
On Wed, 20 Jul 2011, david@xxxxxxx wrote:

If the generation number is part of the repository then it's going to
be the same for everyone.
The actual generation number will be, and has to be, the same for
everyone with the same repository content, regardless of the cache used.
It is a well-defined number with no room for interpretation.

Nonsense.

Even if the generation number is well-defined and shared by all clients, the only quasi-essential definition is "for each A in ancestors_of(B), gen(A) < gen(B)".

In practice, the actual generation number *will be the same* for everyone with the same repository content, unless and until someone develops a different calculation method. But there is no reason to require that the number *has to be* the same for everyone unless you expect (or require) everyone to share their gen-caches.
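To illustrate the point above, here is a minimal sketch in Python, not taken from git itself. It assumes a toy history (the hypothetical `PARENTS` map) and two made-up numbering schemes: `longest_path_gens`, which uses 1 + the maximum generation of the parents, and `arrival_order_gens`, which simply numbers commits in the order a local repository happened to see them. Both satisfy "gen(A) < gen(B) for each A in ancestors_of(B)", yet they disagree on the actual numbers.

```python
from typing import Dict, List, Set

# Hypothetical toy history: commit name -> list of parent names.
PARENTS: Dict[str, List[str]] = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],  # merge of b and c
    "e": ["d"],
}


def longest_path_gens(parents: Dict[str, List[str]]) -> Dict[str, int]:
    """gen(c) = 1 + max(gen(p) for p in parents), roots get 1."""
    gens: Dict[str, int] = {}

    def gen(c: str) -> int:
        # Recursive for brevity; a real history would need an iterative walk.
        if c not in gens:
            gens[c] = 1 + max((gen(p) for p in parents[c]), default=0)
        return gens[c]

    for c in parents:
        gen(c)
    return gens


def arrival_order_gens(arrival: List[str]) -> Dict[str, int]:
    """Number commits by local arrival order (assumes parents arrive first)."""
    return {c: i + 1 for i, c in enumerate(arrival)}


def ancestors(parents: Dict[str, List[str]], c: str) -> Set[str]:
    """All commits reachable from c through parent links, excluding c."""
    seen: Set[str] = set()
    stack = list(parents[c])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents[p])
    return seen


def satisfies_invariant(parents: Dict[str, List[str]], gens: Dict[str, int]) -> bool:
    """Check gen(A) < gen(B) for every A in ancestors_of(B)."""
    return all(gens[a] < gens[b] for b in parents for a in ancestors(parents, b))


shared = longest_path_gens(PARENTS)
local_1 = arrival_order_gens(["a", "b", "c", "d", "e"])
local_2 = arrival_order_gens(["a", "c", "b", "d", "e"])
```

All three numberings pass `satisfies_invariant`, but commit "b" gets 2 under `shared` and `local_1` and 3 under `local_2`. That only matters if the numbers are ever compared across repositories.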

And I think this is why Linus is not happy with a cache. He sees this as something that has significantly more value if it is consistent in a distributed manner than if it is just calculated locally and can differ from other systems.

It will only be used locally, so it needn't be consistent with anyone else's.


If it's just locally generated, then I could easily see generation numbers being different on different people's systems, depending on the order in which they see commits (either locally generated or pulled from others).

If it's part of the commit, then as that commit gets propagated the generation number gets propagated as well, and every repository will agree on what the generation number is for any commit that's shared.

I agree that this consistency guarantee seems to be valuable.

I can't see why.

Surely there will be a competent and efficient gen-cache API. But most code can just ask if B --contains A, or even just use rev-list, and benefit from the increased speed of the answer, because most code doesn't really care about the gen numbers themselves, only about the speed of determining ancestry.
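As a rough sketch of why callers only need the speedup and not the numbers, here is how an ancestry test could use generation numbers to prune its walk. This is illustrative Python, not git's implementation; `PARENTS`, `GENS` and `is_reachable` are hypothetical names, and the only property relied on is gen(A) < gen(B) for every ancestor A of B.

```python
from typing import Dict, List, Set

# Hypothetical toy history: commit name -> list of parent names.
PARENTS: Dict[str, List[str]] = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "e": ["d"],
}

# Any numbering with gen(ancestor) < gen(descendant) works here;
# it could come from a local cache or from the commits themselves.
GENS: Dict[str, int] = {"a": 1, "b": 2, "c": 2, "d": 3, "e": 4}


def is_reachable(parents: Dict[str, List[str]], gens: Dict[str, int],
                 a: str, b: str) -> bool:
    """Return True if commit a is b itself or an ancestor of b."""
    target_gen = gens[a]
    seen: Set[str] = set()
    stack = [b]
    while stack:
        c = stack.pop()
        if c == a:
            return True
        # A commit whose generation is not above gen(a) cannot have a as
        # an ancestor, so its whole history can be skipped.
        if c in seen or gens[c] <= target_gen:
            continue
        seen.add(c)
        stack.extend(parents[c])
    return False
```

The caller only ever sees a yes/no answer: `is_reachable(PARENTS, GENS, "b", "e")` is true, while `is_reachable(PARENTS, GENS, "c", "b")` is false without walking below "b" at all.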

In that case, why bother with generation numbers at all? The improved data based heuristic seems to solve that problem.

Does it? Surely the ruckus would've died down in that case. But I haven't been reading pu.

It seems to me that the main drawback to a gen-cache is that it slows down the first operation after even a local clone (with just hardlinks).

On the other hand, I see too many nails in the distributed-gen-numbers coffin: legacy commits can't catch up (and therefore suffer), and legacy clients can trash or corrupt even "new-style" commits.

Phil
