Re: Git commit generation numbers

Drew Northup <drew.northup@xxxxxxxxx> · Thu, 21 Jul 2011 08:03:09 -0400

On Wed, 2011-07-20 at 16:26 -0700, david@xxxxxxx wrote:
> On Wed, 20 Jul 2011, George Spelvin wrote:
> 
> >> The alternative of having to sometimes use the generation number,
> >> sometimes use the possibly broken commit date, makes for much more
> >> complicated code that has to be maintained forever.  Having a solution
> >> that starts working only after a certain point in history doesn't look
> >> elegant to me at all.  It is not like having different pack formats
> >> where back and forth conversions can be made for the _entire_ history.
> >
> > It seemed like a pretty strong argument to me, too.
> 
> except that you then have different caches on different systems. If the 
> generation number is part of the repository then it's going to be the same 
> for everyone.

I keep reading this argument, and as far as I can tell it is unfounded.
For any work not yet integrated back into a shared repository it just
isn't true, and even after upstream integration the statement may hold
only in a limited way.

I have yet to read a single discussion about how generation numbers [baked
into a commit] deal with rebasing, for instance. Do we assign one more
than the revision prior to the base of the rebase operation or do we
start with the revision one after the highest of those original commits
included in the rebase? Depending on how that is done
_drastically_different_ numbers can come out of different repository
instances for the same _final_ DAG. This is one major reason why, as I
see it, local storage is good for generation numbers and putting them in
the commit is bad. 
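
To make the ambiguity concrete, here is a minimal Python sketch of the two
numbering policies asked about above. It is purely illustrative: the
function names and the example numbers are hypothetical, the reading of
"one more than the revision prior to the base" as "new base + 1" is an
assumption, and nothing here reflects how Git itself behaves.

```python
# Hypothetical sketch: two ways a rebase could renumber the same commits.

def renumber_from_base(new_base_gen, n_commits):
    """Policy A (assumed reading): continue counting from the new base."""
    return [new_base_gen + 1 + i for i in range(n_commits)]


def renumber_after_originals(original_gens):
    """Policy B: start one past the highest number among the originals."""
    start = max(original_gens) + 1
    return [start + i for i in range(len(original_gens))]


# Three local commits were numbered 4, 5, 6 before the rebase; the
# upstream commit they are rebased onto carries number 5.
original_gens = [4, 5, 6]
new_base_gen = 5

policy_a = renumber_from_base(new_base_gen, len(original_gens))
policy_b = renumber_after_originals(original_gens)

print("policy A:", policy_a)
print("policy B:", policy_b)
```

Both runs describe the same resulting chain of three commits on top of the
same base, yet the numbers written into the commits differ depending on
which policy the rebasing repository happened to use.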

I have no problem with putting an _advisory_ "revision number" in the
commit. It would not be expected to have a proper "1-to-1 and onto"
functional association with the _final_ DAG, but it could potentially
get us some nice benefits. We would still need to answer questions like
the one I ask above, but it would hurt less to change if we need to.

One other sane option that was mentioned at least once in passing was to
store the generation number in some Git "filesystem-level" object. This
could then be reconciled with each "git gc" or "git fsck" operation if
not more often. This is less ad-hoc and messy than a separate cache,
becomes amenable to the standard tool-set, and always gets updated (no
invalid cache). If an _advisory_ revision number is available in commits
that are sent along, those could conceivably be used to help build up the
local git-fs generation numbers more quickly. (If a "git pull" is issued
to our repo, or we push to another, we don't send the generation numbers
locally stored--we expect the git-fs machinery to regenerate those on
the fly.)
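
As a rough sketch of what "regenerate those on the fly" could look like,
the Python below derives a number for every commit purely from parent
links, so nothing needs to travel with a push or pull. It assumes a
hypothetical in-memory map from commit name to parent names, an acyclic
history, and the convention that a root commit gets 1; it is not Git's
actual machinery.

```python
# Hypothetical sketch: rebuild generation numbers locally from the DAG.

def compute_generations(parents):
    """Return {commit: generation}, where generation is 1 + the largest
    generation among the commit's parents (roots get 1).

    `parents` maps each commit name to a list of its parent names; every
    parent must also appear as a key. Iterative to avoid deep recursion
    on long histories.
    """
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            pending = [p for p in parents[commit] if p not in gen]
            if pending:
                # Number the parents first, then revisit this commit.
                stack.extend(pending)
            else:
                gen[commit] = 1 + max(
                    (gen[p] for p in parents[commit]), default=0
                )
                stack.pop()
    return gen


# A small history: A is the root, B and C branch from A, D merges them.
history = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
    "E": ["D"],
}

generations = compute_generations(history)
print(generations)
```

Because the result depends only on the parent links, any two repositories
holding the same DAG would arrive at the same local numbers, whatever
advisory values the commits themselves carried.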

I may not be one of the "resident rocket scientists," but that's how I
see it.

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59
