On Thu, 8 Jul 2010, Theodore Tso wrote:

> On Jul 7, 2010, at 1:45 PM, Jeff King wrote:
>
> > And of course it's just complex, and I tend to shy away from
> > complexity when I can. The question to me comes back to (1) above.
> > Is massive clock skew a breakage that should produce a few
> > incorrect results, or is it something we should always handle?
>
> Going back to the question that kicked off this thread, I wonder if there
> is some way that caching could be used to speed up all the cases,
> or at least the edge cases, without imposing as much latency as tracking
> the max skew? i.e., something like gitk's gitk.cache file. For bonus
> points, it could be a cache file that is used by both gitk and git tag
> --contains, git branch --contains, and git name-rev.
>
> Does that sound like a reasonable idea?

I don't think any caching would be as good as fixing the fundamental
issue. Git is fast, sure. But it could be way faster yet in its graph
traversal. And my pack v4 format is meant to overcome all those
obstacles that Git currently has to work through in order to walk its
commit graph.

Once one realizes that most of the commit object headers are SHA1
references which need not be compressed with zlib as is done now, and
that the author and committer info can be factored out in a dictionary
table, and that even those SHA1 references can be substituted with an
index value into the pack index file (a bit like the OFS variant of the
delta object), meaning that even the object lookup could be bypassed,
then it would be possible to make graph traversal an order of magnitude
cheaper in terms of computing cycles and memory touched.

The pack format v4 has been brewing in my head for... well... years
now. And that is good because I've improved on the original v4 design
even more since then. And I even found some time to write more code
lately. I have the new object encoding code almost working for trees
and commits.
My Git hacking time is still limited, though, so this is progressing
slowly.

Just to say that I don't think any kind of caching should be necessary
in the end, as it is possible to encode object data in a pack in a way
that ought to be about as fast to access as a separate cache would be.

So if someone is pondering working on a cache layer, then I'd have an
alternate suggestion or two for that person. ;-)


Nicolas
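
To make the encoding idea in the message above a bit more concrete, here is a
minimal sketch in Python. It is not the pack v4 format and not code from the
message: the names `Commit`, `PackedCommit`, `encode` and `ancestors` are
hypothetical, and a sorted Python list stands in for the pack index. It only
illustrates the two points made above: identity strings stored once in a
dictionary table, and SHA1 references replaced by integer positions so that a
graph walk needs no lookup by object name.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Commit:
    """A commit with textual references, roughly as it is parsed today."""
    sha: str
    tree: str
    parents: List[str]
    author: str
    committer: str


@dataclass
class PackedCommit:
    """Hypothetical compact record: every field is a small integer."""
    tree: int            # position in the sorted object-name table
    parents: List[int]   # positions in the sorted object-name table
    author: int          # position in the identity dictionary
    committer: int       # position in the identity dictionary


def encode(commits: List[Commit]) -> Tuple[List[str], List[str], Dict[int, PackedCommit]]:
    # Sorted table of every object name; this stands in for the pack index.
    names = sorted({c.sha for c in commits}
                   | {c.tree for c in commits}
                   | {p for c in commits for p in c.parents})
    pos = {name: i for i, name in enumerate(names)}

    # Each distinct author/committer string is stored exactly once.
    idents = sorted({c.author for c in commits} | {c.committer for c in commits})
    ident_pos = {ident: i for i, ident in enumerate(idents)}

    records = {
        pos[c.sha]: PackedCommit(
            tree=pos[c.tree],
            parents=[pos[p] for p in c.parents],
            author=ident_pos[c.author],
            committer=ident_pos[c.committer],
        )
        for c in commits
    }
    return names, idents, records


def ancestors(records: Dict[int, PackedCommit], start: int) -> Set[int]:
    """Walk the commit graph using integer positions only."""
    seen: Set[int] = set()
    stack = [start]
    while stack:
        i = stack.pop()
        if i in seen:
            continue
        seen.add(i)
        rec = records.get(i)  # None if the parent is not in this table
        if rec is not None:
            stack.extend(rec.parents)
    return seen
```

In this sketch the object names are only needed at the edges: once to turn a
starting name into a position, and once to turn the resulting positions back
into names, e.g. `{names[i] for i in ancestors(records, names.index(sha))}`.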