Re: Git commit generation numbers

On Thu, Jul 14, 2011 at 01:19:51PM -0700, Linus Torvalds wrote:

> >Out of curiosity, what don't you like about the generation cache?
> 
> The thing I hate about it is very fundamental: I think it's a hack
> around a basic git design mistake. And it's a mistake we have known
> about for a long time.
> 
> Now, I don't think it's a *fatal* mistake, but I do find it very
> broken to basically say "we made a mistake in the original commit
> design, and instead of fixing it we create a separate workaround for
> it".
> 
> THAT I find distasteful. My reaction is that if we're going to add
> generation numbers, then we should just do it the way we should have
> done them originally, rather than as some separate hack.
> 
> See? That's why I wouldn't have any problem with adding a separate
> cache on top of it, if it's really required, but I would hope that it
> isn't really needed.
> 
> So a cache in itself is not necessarily wrong. But leaving the
> original design mistake in place IS.

Thanks, that makes some sense to me.

However, I'm not 100% convinced leaving generation numbers out was a
mistake. The git philosophy seems always to have been to keep the
minimal required information in the DAG. And I think that has served us
well, because we're not saddled with cruft that seemed like a good idea
early on, but isn't.

Generation numbers are _completely_ redundant with the actual structure
of history represented by the parent pointers. Having them in there is
not about giving git more information that it doesn't have, but about
being a cheap place to stuff a value that is a little expensive to
calculate.

And so that seems a bit hack-ish to me.
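
To make the redundancy concrete, here is a minimal sketch of deriving
generation numbers from nothing but the parent pointers. It is an
illustration, not git's code: the `parents` dictionary, the function
name, and the convention that a root commit has generation 0 are all
assumptions made for the example.

```python
def generation_numbers(parents):
    """Compute a generation number for every commit.

    `parents` maps a commit id to the list of its parent ids
    (hypothetical in-memory stand-in for the commit DAG).
    Assumed convention: a root commit has generation 0, and any other
    commit has 1 + the maximum generation of its parents.
    """
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            # Parents must be numbered before the commit itself.
            pending = [p for p in parents[commit] if p not in gen]
            if pending:
                stack.extend(pending)
            else:
                gen[commit] = 1 + max(
                    (gen[p] for p in parents[commit]), default=-1
                )
                stack.pop()
    return gen


# A small history: A is the root, D merges B and C, E sits on top of D.
history = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
    "E": ["D"],
}
numbers = generation_numbers(history)
```

Everything in `numbers` is recomputable from `history` at any time; the
only cost is the walk over the whole graph, which is the "little
expensive" part.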

I liken it somewhat to the "don't store renames" debate. We don't want
to crystallize forever in the history whatever crappy rename-detection
algorithm is done at the time of commit. We put the minimum amount of
information in the DAG, and it's the runtime's responsibility to get the
answer.

I think the decision is a little more gray with generation numbers,
because it's not about "you got this information with a wrong and crappy
algorithm" like it might be with rename detection, but rather "we're
sticking this redundant number in the commit object, and we assume that
it will always be useful enough to future algorithms to merit being
here".

> And fixing it really ended up being a very tiny patch, no?

Well, yes. But it also doesn't yield a 100-fold speedup in "git tag
--contains" for existing repositories. So it's not quite a full
solution.
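
For context, a rough sketch of why a "contains"-style query benefits
from generation numbers at all: an ancestor always has a strictly
smaller generation than its descendant, so the walk can stop at any
commit whose generation is not above the target's. This is an assumed
illustration, not the code behind "git tag --contains"; the `parents`
and `gen` dictionaries and the function name are hypothetical, and `gen`
must already be available for every commit, which is exactly what
existing repositories lack.

```python
def contains(parents, gen, tip, target):
    """Return True if `target` is reachable from `tip` via parent pointers.

    `parents` maps commit id -> list of parent ids; `gen` maps
    commit id -> generation number (both hypothetical inputs).
    """
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit == target:
            return True
        # An ancestor has a strictly smaller generation, so a commit at
        # or below the target's generation cannot lead to the target.
        if commit in seen or gen[commit] <= gen[target]:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return False


# Same shape of history as a tiny example: D merges B and C, E is on top.
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["D"]}
gen = {"A": 0, "B": 1, "C": 1, "D": 2, "E": 3}

e_has_b = contains(parents, gen, "E", "B")  # B is an ancestor of E
c_has_b = contains(parents, gen, "C", "B")  # pruned at C without visiting A
b_has_d = contains(parents, gen, "B", "D")  # D is newer than B
```

Without the generation check, a negative answer requires walking all
the way down to the roots, which is where the time goes on a large
history.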

-Peff