On Sat, 27 Jan 2007, Simon 'corecode' Schubert wrote:
>
> git rev-list and git log (with or without -p) perform poorly when invoked with
> a pathspec.

Really? I would say exactly the opposite.

They _smoke_ when invoked with a pathspec. Show me *one* other SCM that
even comes close..

And please, realize that git does arbitrary combinations of directories,
and not just single files. AND THAT IS IMPORTANT!

Any SCM that can't do

	git log drivers/scsi/ include/scsi/

and have it be a sane log of the changes to the _union_ of those two
directories is strictly inferior to what git can do.

Usually this is something that others CANNOT DO AT ALL. Even your 1:18
number is a hell of a lot faster than "can't do it", which is what you
have for everything else I can imagine.

Maybe you just do single files, but my pathspecs tend to be directories or
multiple files more often than single ones. How the heck did you intend to
cache that?

> I agree with those numbers. However, on a converted KDE repo, they are
> *completely* different:
>
> git log kdelibs/README takes 1:18. One minute, eighteen seconds.
> git rev-list and git blame take roughly the same time.

Do you have the converted repo somewhere to be cloned from? It's going to
be a lot more interesting for scalability testing than anything else.

It is possible, for example, that the real issue is that we shouldn't
compress delta objects in a pack.

> That's what we were getting at. Not the superiority of git blame (no irony)
> and thus reduced speed, but the algorithmic deficiency of any operation on a
> pathspec/object, which can be easily fixed.

The thing is, one of the reasons the git object database is small is that
it compresses really well, and I suspect that for the KDE repo, what
you're seeing is really a combination of:

 - the KDE people were idiots in the first place to make it into one big
   repo

 - we've consciously made repo size be a major goal, and yes, we spend a
   lot of CPU as a result, following delta chains etc.
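
To make the delta-chain cost concrete, here is a rough sketch. It is not git's real pack code: the copy/insert delta format, the JSON encoding and the names `store`, `put`, `apply_delta` and `read_object` are all made up for illustration. The only point is that reading one object at the end of a chain means inflating every link before the (cheap) delta application can happen.

```python
import json
import zlib

# Hypothetical, simplified delta format (NOT git's actual pack encoding):
#   ["copy", offset, length]  -> copy bytes from the base object
#   ["insert", text]          -> append new bytes

def apply_delta(base: bytes, ops: list) -> bytes:
    """Rebuild an object from its base plus a list of copy/insert ops."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:  # "insert"
            out += op[1].encode()
    return bytes(out)

def read_object(store: dict, name: str, stats: dict) -> bytes:
    """Read one object, inflating every link of its delta chain."""
    base_name, payload = store[name]
    stats["inflates"] += 1          # one zlib inflate per chain link
    raw = zlib.decompress(payload)
    if base_name is None:           # full object, end of the chain
        return raw
    base = read_object(store, base_name, stats)
    return apply_delta(base, json.loads(raw))

def put(store: dict, name: str, base_name, content: bytes) -> None:
    """Store either a full object (base_name=None) or a delta, compressed."""
    store[name] = (base_name, zlib.compress(content))

# A three-link chain: v1 is stored whole, v2 and v3 are deltas.
store = {}
put(store, "v1", None, b"hello world\n")
put(store, "v2", "v1", json.dumps([["copy", 0, 6], ["insert", "git\n"]]).encode())
put(store, "v3", "v2", json.dumps([["copy", 0, 9], ["insert", "!\n"]]).encode())

stats = {"inflates": 0}
content = read_object(store, "v3", stats)
print(content, stats)
```

In this toy model the number of inflates equals the chain depth, so a history walk that touches many deep objects pays that cost over and over.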

The zlib overhead is more visible, because once you've uncompressed the
delta the delta itself is really quick to apply, but the whole "trees
compress really well" all boils down to the same thing: we create lots of
small objects, and we have tons of deltas, and the hierarchical nature of
the data structures (ie saving the trees not as one big manifest but as a
more complex hierarchical data structure) is what allows us to do tons of
the path-based optimizations.

But they all do end up boiling down to "we use lots of CPU".

And I suspect tweaking the existing stuff is quite reasonable. But we need
to have a public repo that people who want to tweak can play with (for
example, the old "linux-history" archive was what made us tweak things
like gitk, which was horribly horribly bad).

So please point to a kde conversion archive to play with (maybe you have,
and I missed it).

		Linus