On Thu, 29 Nov 2007, Junio C Hamano wrote:
>
> I am hoping that "probably 10s of those 17s" can actually be measured
> with the patch I sent out last night.  Has anybody took a look at it?

Sorry, I missed it.

But I just did timings. Your patch helps

	git read-tree -m -u --exclude-per-directory=.gitignore HEAD HEAD

timings enormously, and it's now down to 3s for me (which is the same
speed as it is without any per-directory-excludes). That's a big
improvement from the ~10s I see without your patch (I've repacked my
tree, and I have to admit that I don't even know if it's the new or the
old order, but I can state that 7s for me was just those .gitignore
files).

Sadly, the full "git checkout" itself is not actually improved, due to
the

	git update-index --refresh

there, which will end up populating the whole directory cache anyway.

I wonder why I didn't see that as the expensive operation when I timed
"git checkout". Probably because I narrowed down on the "git read-tree"
as the operation that actually accesses the pack-file and the object
directory, while the "git update-index" never touches the actual
objects.

Anyway, I think your patch is great. It just doesn't help the full case
of a "git checkout", only the read-tree portion of it ;(

As to partitioning the data according to types:

> When I do archaeology, I think I often run blame first to see which
> change made the block of text into the current shape first, and then run
> a path limited "git log -p" either starting or ending at that revision.
> In that workflow, the initial blame may get slower with the new layout,
> but I suspect it would help by speeding up the latter "git log -p" step.

I really cannot convince myself one way or the other. I have a suspicion
that sometimes it helps to have objects (regardless of type) close to
each other, and sometimes it helps to have the trees packed densely.
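
To make that trade-off concrete, here is a toy sketch. It is not git's
pack-writing code: list positions merely stand in for pack offsets, and
every name in it (commits_first, partition_by_type, span) is hypothetical.
It compares grouping only the commits against partitioning fully by type,
for a trees-only access pattern and for one that needs a tree plus its
blobs.

```python
# Toy model only: positions in a list stand in for offsets in a pack file.
# None of these names come from git; they are made up for illustration.

def commits_first(objects):
    """Group commits at the front; keep trees and blobs in original order."""
    commits = [o for o in objects if o[0] == "commit"]
    rest = [o for o in objects if o[0] != "commit"]
    return commits + rest

def partition_by_type(objects):
    """Group commits, then trees, then blobs (stable within each type)."""
    rank = {"commit": 0, "tree": 1, "blob": 2}
    return sorted(objects, key=lambda o: rank[o[0]])

def span(layout, wanted):
    """Distance between the first and last wanted object in the layout."""
    positions = [i for i, o in enumerate(layout) if o in wanted]
    return max(positions) - min(positions)

# Three revisions, newest first, each with a commit, a tree and two blobs.
objects = []
for rev in (3, 2, 1):
    objects += [("commit", rev), ("tree", rev),
                ("blob", f"a{rev}"), ("blob", f"b{rev}")]

trees_only = {("tree", 1), ("tree", 2), ("tree", 3)}          # trees across history
one_revision = {("tree", 2), ("blob", "a2"), ("blob", "b2")}  # a tree and its blobs

for name, layout in (("commits first", commits_first(objects)),
                     ("by type", partition_by_type(objects))):
    print(name, span(layout, trees_only), span(layout, one_revision))
```

In this toy, the trees-only walk spans 6 slots with commits grouped and 2
when partitioned by type, while the tree-plus-blobs access goes from 2 to
5. Which effect dominates on a real repository is exactly what only
measurements can tell.
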
A lot of operations *do* work on both blobs and trees (a *raw* diff
doesn't, but those are fairly rare), so this is not at all clear-cut
like the commit case.

So sorting the commits together is a no-brainer, since a lot of really
important ops only look at them. But blobs and trees? The numbers
certainly go both ways, and I suspect we are probably better off not
messing with the sort order unless we have some unambiguous real
results.

Oh, well. I was hoping that I'd have a number of cases that showed good
improvements, with perhaps the bulk of them not showing much difference
at all. But while I saw the good improvements, the very first try at
"git blame" also showed considerably worse numbers, so I think we should
consider it an interesting idea, but probably shelve it.

		Linus