On Sat, 10 Feb 2008, Linus Torvalds wrote:
>
> On Sun, 10 Feb 2008, Jakub Narebski wrote:
> >
> > P.S. Would having generation + roots be enough?
>
> I'm wavering here. Maybe just "generation" works (the longest path from
> any root), because what we are looking for is essentially "guarantee that
> this commit cannot possibly be reached from that other commit", and I
> guess a simple generation count actually does work for that (if the
> generation of "x" is smaller than the generation of "y", we definitely
> cannot reach y from x).

Well, the "generation" number alone would work quite well as an
exclusion mechanism; generation + roots would work better, I think.

Let's take as an example the following revision graph:

roots:  a    a    a    aA   aA
gen:    1    2    3    4    5
        a----b----c----d----e
                      /
             A----B--/
gen:         1    2
roots:       A    A

For example, the generation number alone is enough to decide that 'c'
(generation 3) cannot be reached from 'a' (generation 1 < 3), and that
'c' (generation 3) cannot be reached from 'B' (generation 2 < 3).
Roots allow for an easy check that 'B' (gen: 2, roots: A) cannot be
reached from 'c' (roots: a, and A \not\in a), but may be reachable
from 'e' (gen: 5 > 2, roots: aA \ni A).

What I don't know is whether the generation number would be enough to
avoid the costly "going to root" or "going to common ancestor" case
when calculating excluded commits.

> At the same time, I'm still not really convinced we need to add the
> redundant info. I do think I *should* have designed it that way to start
> with (and I thought so two years ago - blaah), so the strongest reason for
> "we should add generation numbers" at least for me is that I actually
> think it's a GoodThing(tm) to have.

While this information can be calculated from the revision graph, it
is, I think, costly enough that it truly would be better to have it in
the commit object.
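To make the two exclusion checks concrete, here is a minimal Python
sketch over the example graph above. The parent map, the function
names, and the assumption that 'B' merges into 'd' (which is what the
root sets in the diagram imply) are illustrative only; this is not how
git computes or stores anything.

```python
from functools import lru_cache

# Hypothetical encoding of the example graph: commit -> list of parents.
# 'd' is assumed to be the merge of 'c' and 'B'.
PARENTS = {
    "a": [], "b": ["a"], "c": ["b"], "d": ["c", "B"], "e": ["d"],
    "A": [], "B": ["A"],
}


@lru_cache(maxsize=None)
def generation(commit):
    """Longest path from any root, counting a root as generation 1."""
    return 1 + max((generation(p) for p in PARENTS[commit]), default=0)


@lru_cache(maxsize=None)
def roots(commit):
    """Set of parentless commits that are ancestors of `commit`."""
    if not PARENTS[commit]:
        return frozenset([commit])
    return frozenset().union(*(roots(p) for p in PARENTS[commit]))


def cannot_reach(start, target):
    """True if the cheap checks prove `target` is not an ancestor of `start`.

    False only means "not excluded"; a real graph walk is still needed.
    """
    if start == target:
        return False
    # A proper ancestor always has a strictly smaller generation.
    if generation(start) <= generation(target):
        return True
    # Every root of an ancestor is also a root of its descendants.
    return not roots(target) <= roots(start)


print(cannot_reach("a", "c"))  # True: generation 1 < 3
print(cannot_reach("c", "B"))  # True: root A is not among c's roots
print(cannot_reach("e", "B"))  # False: neither check excludes it
```

Note that both checks are one-sided: they can prove that a commit is
unreachable, but a "False" answer still requires walking the graph.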

Well, we could always start using core.repositoryFormatVersion ;-)

> But adding it is a pretty invasive thing, and would force people to
> upgrade (it really isn't backwards compatible - old versions of git would
> immediately refuse to touch archives with even just a single top commit
> that has a generation number in it, unless we'd hide it at the end of the
> buffer and just uglify things in general).

Well, we could always add it as a local (per repository) "cache".

With only generation numbers we could use a pack-index-like format to
store a mapping "commit sha-1 => generation number", just like the
pack index now stores the mapping "object sha-1 => offset in pack".
If we also want to store roots, we could either map "commit sha-1 =>
generation number, roots set offset / id" (constant length value)[*1*],
or have a gen-*.gen file with generation numbers and roots, and
gen-*.idx as an index to that file.

[*1*] If I understand the math correctly, it would limit us in theory
to at most 64 roots (git.git has 8 roots IIRC).

-- 
Jakub Narebski
Poland
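
As a rough illustration of the pack-index-like cache idea, the sketch
below stores sorted, constant-length records and looks them up by
binary search. The record layout (20-byte sha-1, 32-bit generation,
64-bit roots bitmask) and all names are assumptions made up for this
example, not an existing git format; the bitmask is one way to read
the 64-roots limit from the footnote, with bit positions assigned to
roots by some separate, hypothetical table.

```python
import hashlib
import struct

# Hypothetical record: sha-1 (20 bytes), generation (u32), roots mask (u64).
# Big-endian with no padding, so every record is exactly 32 bytes.
RECORD = struct.Struct(">20sIQ")


def fake_sha1(name):
    """Stand-in object id for the demo; real ids come from commit contents."""
    return hashlib.sha1(name.encode()).digest()


def build_cache(entries):
    """Pack (sha1, generation, roots_mask) tuples, sorted by sha-1."""
    return b"".join(RECORD.pack(*entry) for entry in sorted(entries))


def lookup(cache, sha1):
    """Binary search over the fixed-size records; None if sha1 is absent."""
    lo, hi = 0, len(cache) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        key, gen, mask = RECORD.unpack_from(cache, mid * RECORD.size)
        if key == sha1:
            return gen, mask
        if key < sha1:
            lo = mid + 1
        else:
            hi = mid
    return None


# Demo with the example graph: bit 0 stands for root 'a', bit 1 for 'A'.
ROOT_A_LOWER, ROOT_A_UPPER = 0b01, 0b10
cache = build_cache([
    (fake_sha1("c"), 3, ROOT_A_LOWER),
    (fake_sha1("B"), 2, ROOT_A_UPPER),
    (fake_sha1("d"), 4, ROOT_A_LOWER | ROOT_A_UPPER),
])
```

With masks like these, the roots exclusion test becomes a single
bitwise operation: target_mask & ~start_mask being non-zero means the
target has a root that the start commit lacks.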