Shawn Pearce wrote:
> On Mon, Nov 22, 2010 at 11:53 PM, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:

>> Other aspects to investigate: choice of hash function;
>
> Why?  SHA-1 is pretty uniform in its distribution.

I got distracted for a moment by the atom table, but since that does
not have a big effect on performance it's probably not worth spending
time on.  Sorry about that; please ignore.

[...]
> The way I
> read this store_tree() code, every subdirectory is recursed into even
> if no modifications were made inside of that subdirectory during the
> current commit.

Doesn't the is_null_sha1 check avoid that?

To further explain the workload: svn-fe receives its blobs from svn in
the form of deltas.  So the conversation might go like this:

S	commit refs/heads/master
S	mark :10000
S	committer felicity <felicity@local>
S	data 74
S	bug 3097: switch spamd from doing 'fork per message' to a 'prefork' model
S	cat incubator/spamassassin/trunk/spamd/spamd.raw
F	89d56462577b8b7b4f4115f2a47f0b3da22b791a blob 63633
F	#!/usr/bin/perl -w -T
...
S	M 100644 inline incubator/spamassassin/trunk/spamd/spamd.raw
S	data 62114
...

Current svn-fe in vcs-svn-pu requests the preimage blobs using marks,
but the idea is the same.

If this proves a bottleneck I suppose we could cache the content of
frequently-requested old blobs and keep pointers to that in the in-core
tree.
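To make the caching idea a bit more concrete, here is a rough sketch,
written in Python only for brevity (svn-fe itself is C).  Everything in
it is hypothetical: BlobCache, fetch and max_bytes are names made up for
illustration, and fetch merely stands in for the "cat" round trip shown
above.  None of this is taken from vcs-svn-pu.

```python
from collections import OrderedDict


class BlobCache:
    """Hypothetical LRU cache of old blob contents, keyed by blob id.

    'fetch' is an assumed callable (blob id -> bytes) standing in for
    asking fast-import for the preimage; it is not a real svn-fe API.
    """

    def __init__(self, fetch, max_bytes):
        self.fetch = fetch
        self.max_bytes = max_bytes      # rough memory budget for cached content
        self.used = 0                   # bytes currently held in the cache
        self.entries = OrderedDict()    # blob id -> bytes, oldest first

    def get(self, blob_id):
        if blob_id in self.entries:
            # Cache hit: mark as most recently used, no round trip needed.
            self.entries.move_to_end(blob_id)
            return self.entries[blob_id]

        # Cache miss: pay for the round trip once.
        data = self.fetch(blob_id)
        if len(data) <= self.max_bytes:
            self.entries[blob_id] = data
            self.used += len(data)
            # Evict least recently used blobs until we fit the budget again.
            while self.used > self.max_bytes:
                _, old = self.entries.popitem(last=False)
                self.used -= len(old)
        return data


if __name__ == "__main__":
    # Stand-in backing store; ids and contents are placeholders.
    store = {"blob-a": b"a" * 10, "blob-b": b"b" * 10}
    requests = []

    def fetch(blob_id):
        requests.append(blob_id)
        return store[blob_id]

    cache = BlobCache(fetch, max_bytes=64)
    cache.get("blob-a")
    cache.get("blob-a")     # served from the cache
    cache.get("blob-b")
    print(requests)
```

The in-core tree would then keep a pointer to the cached content
instead of the dict lookup used here, and whether an LRU policy is the
right one would depend on how often svn actually asks for the same
preimage twice.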