On Tue, Jan 1, 2013 at 4:10 AM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Tue, Jan 1, 2013 at 11:15 AM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
>>> Fix pack-objects to behave the way JGit does, cluster commits first in
>>> the pack stream. Now you have a dense space of commits. If I remember
>>> right this has a tiny positive improvement for most rev-list
>>> operations with very little downside.
>>
>> I was going to suggest a similar thing. The current state of C Git's
>> pack writing is not bad. We mix commits and tags together, but tags
>
> And I was wrong. At least since 1b4bb16 (pack-objects: optimize
> "recency order" - 2011-06-30) commits are spread out and can be mixed
> with trees too. Grouping them back defeats what Junio did in that
> commit, I think.

I think you misunderstand what 1b4bb16 does. Junio uses a layout
similar to what JGit has done for years. Commits are packed first, then
trees, then blobs. Only annotated tags are interspersed with commits.
The decision on where to place tags differs between the two
implementations, but it serves a similar purpose.

How blobs are written is very different; Junio's implementation is
strictly better than JGit's[1].

So we can use pack ordering. There will be gaps because of tags, but
if we assume there are fewer tags than commits, the cache file will
still be a reasonable size.

[1] I have known this since he was developing this commit. We talked
    about clustering by delta chain and the improvements it showed in
    CGit. I tried to implement a similar delta chain clustering in JGit
    but broke something in the packer and caused data corruption, so
    it's stalled.
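
To make the "gap because of tags" point concrete, here is a rough sketch of a cache indexed by pack position. It is only an illustration under assumptions: the object names, the `pack_order` list, and the `commit_cache_slots` helper are all made up, and this is not how pack-objects or any actual cache file is implemented.

```python
# Hypothetical pack layout: commits first (annotated tags interspersed),
# then trees, then blobs. Object names are invented for illustration.
pack_order = [
    ("commit", "c3"),
    ("tag", "v1.0"),
    ("commit", "c2"),
    ("commit", "c1"),
    ("tree", "t3"),
    ("tree", "t2"),
    ("blob", "b2"),
    ("blob", "b1"),
]


def commit_cache_slots(pack_order):
    """Return (slots, n_commits) for a cache indexed by pack position.

    slots spans the pack range from the first to the last commit.
    Non-commit entries inside that range (here, tags) become None gaps.
    """
    positions = [i for i, (kind, _) in enumerate(pack_order) if kind == "commit"]
    if not positions:
        return [], 0
    first, last = positions[0], positions[-1]
    slots = [
        name if kind == "commit" else None
        for kind, name in pack_order[first:last + 1]
    ]
    return slots, len(positions)


slots, n_commits = commit_cache_slots(pack_order)
wasted = len(slots) - n_commits  # slots occupied by tags, not commits
print(slots, n_commits, wasted)
```

With this toy layout the cache needs one slot per object in the commit region, so the overhead is exactly the number of tags that fall between the first and last commit.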