> Currently, the format includes 8 bytes to share between the generation
> number and commit date. Due to alignment concerns, we will want to keep this
> as 8 bytes or truncate it to 4-bytes. Either we would be wasting at least 3
> bytes or truncating dates too much (presenting the 2038 problem [1] since
> dates are signed).

Good point. I forgot about them while writing the previous email.
That is reason enough to keep the generation numbers, sorry for the noise.

>
>> I only glanced at the paper, but it looks like a "more advanced 2d
>> generation number" that seems to be able to answer questions
>> that gen numbers can answer, but that paper also refers
>> to SCARAB as well as GRAIL as the state of the art, so maybe
>> there are even more papers to explore?
>
>
> The biggest reason I can say to advance this series (and the small follow-up
> series that computes and consumes generation numbers) is that generation
> numbers are _extremely simple_. You only need to know your parents and their
> generation numbers to compute your own. These other reachability indexes
> require examining the entire graph to create "good" index values.

Yes, that is a good point, too. Generation numbers can be computed
"commit locally" and do not need expensive setups, which the others
presumably need.

> The hard part about using generation numbers (or any other reachability
> index) in Git is refactoring the revision-walk machinery to take advantage
> of them; current code requires O(reachable commits) to topo-order instead of
> O(commits that will be output). I think we should table any discussion of
> these advanced indexes until that work is done and a valuable comparison can
> be done. "Premature optimization is the root of all evil" and all that.

agreed,
Stefan
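
To make the "commit locally" point concrete, here is a rough sketch, not the code from the series. It assumes that a commit without parents has generation 1 and that every other commit gets one more than the largest generation among its parents. The `history` dict and the function names are made up for illustration.

```python
def compute_generations(parents):
    """Return {commit: generation} for a DAG given as {commit: [parent, ...]}.

    Assumption: every parent also appears as a key in `parents`.
    Each commit's number depends only on its parents' numbers.
    """
    generation = {}
    for start in parents:
        stack = [start]  # explicit stack, so long histories do not hit the recursion limit
        while stack:
            commit = stack[-1]
            if commit in generation:
                stack.pop()
                continue
            missing = [p for p in parents[commit] if p not in generation]
            if missing:
                # Number the parents first, then come back to this commit.
                stack.extend(missing)
            else:
                generation[commit] = 1 + max(
                    (generation[p] for p in parents[commit]), default=0
                )
                stack.pop()
    return generation


def cannot_reach(generation, a, b):
    """True if `b` is certainly not an ancestor of `a`.

    An ancestor always has a strictly smaller generation, so
    gen(a) <= gen(b) with a != b rules out reachability.
    A False result means "unknown, a walk is still needed".
    """
    return a != b and generation[a] <= generation[b]


# Hypothetical history: A is the root, D merges B and C, E sits on top of D.
history = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["D"]}
generation = compute_generations(history)
print(generation)
```

The negative answer in `cannot_reach` needs no walk at all; a positive answer still requires walking the graph, which is where the revision-walk refactoring mentioned above comes in.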