Re: Git commit generation numbers

Phil Hord <hordp@xxxxxxxxx> · Thu, 21 Jul 2011 12:24:01 -0400

On 07/21/2011 11:57 AM, Drew Northup wrote:
> On Thu, 2011-07-21 at 08:55 -0400, George Spelvin wrote:
>>> I have not read yet one discussion about how generation numbers [baked
>>> into a commit] deal with rebasing, for instance. Do we assign one more
>>> than the revision prior to the base of the rebase operation or do we
>>> start with the revision one after the highest of those original commits
>>> included in the rebase? Depending on how that is done
>>> _drastically_different_ numbers can come out of different repository
>>> instances for the same _final_ DAG. This is one major reason why, as I
>>> see it, local storage is good for generation numbers and putting them in
>>> the commit is bad.
>> Er, no.  Whenever a new commit object is generated (as the result
>> of a rebase or not), its commit number is computed based on its
>> parent commits.  It is NEVER copied.
> I don't see the word "copy" in my original.

> B-O1-O2-O3-O4-O5-O6
>   \
>    R1----R2-------R3
>
> What's the correct generation number for R3? I would say gen(B)+3.

And you would be correct if you follow the SoP algorithm.

> My reading of the posts made by some others was that they thought
> gen(O6) was the correct answer. Still others seemed to indicate
> gen(O6)+1 was the correct answer.

Maybe the confusion comes from the different storage mechanisms being 
discussed.  If the generation numbers are in a local cache and used by a 
single client, the determinism of the specific numbers doesn't much 
matter.  If they are part of the commit, it still doesn't need to be 
completely deterministic. However, interoperability requires standards, 
and standards favor determinism, so dogmatic determinism may triumph in 
that case.

1. gen(O6) might make sense if you mean to implement --date-order using
gen-numbers, for example.  But I don't think it's practical in any case.

2. gen(O6)+1 might make sense if you mean to require that gen-numbers
are unique per repo.  But this is both unsupportable and unnecessary, so
it's a non-starter.

3. gen(B)+3 is what you'd get from the algorithm I saw proposed.

All three of these are provably correct by my definition of "correct": 
"for each A in ancestors_of(B), gen(A) < gen(B)".

However, [1] and [2] have some extra features of dubious value.  Simpler 
is better for interoperability, so I like [3] for this purpose.
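
To make the comparison concrete, here is a rough Python sketch of the
example history. It is only an illustration: the names PARENTS,
parent_based_gens and is_correct are made up for this message, and the
root is arbitrarily numbered 0. It assumes the proposed algorithm is
"one more than the largest parent number", then checks each candidate
value for R3 against the definition of "correct" given above.

```python
# Parent lists for the example history; B is the root commit.
PARENTS = {
    "B": [],
    "O1": ["B"], "O2": ["O1"], "O3": ["O2"],
    "O4": ["O3"], "O5": ["O4"], "O6": ["O5"],
    "R1": ["B"], "R2": ["R1"], "R3": ["R2"],
}


def ancestors(commit, parents):
    """Return every commit reachable from `commit` through parent links."""
    seen = set()
    stack = list(parents[commit])
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def parent_based_gens(parents):
    """Assign gen(c) = 1 + max(gen(p) for each parent p); roots get 0.

    Recursion is fine for a toy graph; a real history would need an
    iterative walk.
    """
    gens = {}

    def gen(c):
        if c not in gens:
            gens[c] = 1 + max((gen(p) for p in parents[c]), default=-1)
        return gens[c]

    for c in parents:
        gen(c)
    return gens


def is_correct(gens, parents):
    """Check: for each A in ancestors_of(B), gen(A) < gen(B)."""
    return all(
        gens[a] < gens[c] for c in parents for a in ancestors(c, parents)
    )


base = parent_based_gens(PARENTS)

# The three candidate values for R3 discussed in this thread.
candidates = {
    "gen(O6)": base["O6"],
    "gen(O6)+1": base["O6"] + 1,
    "gen(B)+3": base["B"] + 3,
}

# Swap each candidate in for R3 and test the invariant.
results = {
    name: is_correct({**base, "R3": value}, PARENTS)
    for name, value in candidates.items()
}
```

Every entry in results comes out True, while a value such as gen(R2)
for R3 would fail the check. The invariant constrains R3 only relative
to its own ancestors, never relative to the O-branch.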

Even [3] has an extra feature I think is unnecessary: determinism.  If 
that "requirement" is dropped, I think all three of these algorithms are 
(functionally) roughly equivalent.
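
A sketch of why determinism may not matter, assuming the numbers are
only used to cut an ancestry walk short (this message does not state
that use itself). The helpers random_valid_gens and is_ancestor are
hypothetical. The numbering is deliberately random, subject only to
each commit getting a number above all of its parents.

```python
import random

# Same example history as in the diagram; B is the root commit.
PARENTS = {
    "B": [],
    "O1": ["B"], "O2": ["O1"], "O3": ["O2"],
    "O4": ["O3"], "O5": ["O4"], "O6": ["O5"],
    "R1": ["B"], "R2": ["R1"], "R3": ["R2"],
}


def random_valid_gens(parents, rng):
    """Give each commit some number strictly above all of its parents."""
    gens = {}

    def gen(c):
        if c not in gens:
            floor = max((gen(p) for p in parents[c]), default=0)
            gens[c] = floor + rng.randint(1, 5)
        return gens[c]

    for c in parents:
        gen(c)
    return gens


def is_ancestor(a, b, parents, gens):
    """Return True if `a` is a strict ancestor of `b`."""
    stack = list(parents[b])
    seen = set()
    while stack:
        c = stack.pop()
        if c == a:
            return True
        # Every ancestor of c has a smaller number than c, so once
        # gens[c] <= gens[a] the commit `a` cannot lie below c.
        if c in seen or gens[c] <= gens[a]:
            continue
        seen.add(c)
        stack.extend(parents[c])
    return False


gens = random_valid_gens(PARENTS, random.Random(0))
print(is_ancestor("B", "R3", PARENTS, gens))
print(is_ancestor("O6", "R3", PARENTS, gens))
```

The answers depend only on the shape of the DAG. Different seeds give
different numbers, and may prune the walk at different points, but the
result of each query stays the same.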

> I don't think everybody MEANT to be saying such
> different things--that's just how they appeared on this end.
>
> Now, did you mean something different by "commit number?"

I remain unconvinced that there is value in gen-number distribution, so 
to my mind, the specific algorithm and whether or not it is 
deterministic are unimportant.

Phil ~ who wasn't really being asked, but felt like answering
