On Fri, Apr 17, 2015 at 05:30:22PM +1000, Stefan Saasen wrote: > The merge is created in a temporary location that uses alternates. The > temporary repository is on a local disk, the alternate object database > on an NFS mount. Is the alternate writeable? If we can't freshen the object, we fall back to storing the object locally, which could have a performance impact. But it looks from your tables below like the utime() call is succeeding, so that is probably not what is happening here. > My current hypothesis is that the additional `access`, but more > importantly the additional `utime` calls are responsible in the > increased merge times that we see. Yeah, that makes sense from your tables. The commit in question flips the order of the loose/packed check, and the packed check should be much faster on your NFS mount. Can you try: diff --git a/sha1_file.c b/sha1_file.c index 88f06ba..822aaef 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -3014,7 +3014,7 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen); if (returnsha1) hashcpy(returnsha1, sha1); - if (freshen_loose_object(sha1) || freshen_packed_object(sha1)) + if (freshen_packed_object(sha1) || freshen_loose_object(sha1)) return 0; return write_loose_object(sha1, hdr, hdrlen, buf, len, 0); } I think that should clear up the access() calls, but leave the utime() ones. > Looking at the detailed strace shows that utime will be called > repeatedly in same cases (e.g. > https://bitbucket.org/snippets/ssaasen/oend shows an example where the > same packfile will be updated more than 4000 times in a single merge). > > http://www.spinics.net/lists/git/msg240106.html discusses a potential > improvement for this case. Would that be an acceptable avenue to > improve this situation? I think so. Here's a tentative patch: diff --git a/cache.h b/cache.h index 3d3244b..72c6888 100644 --- a/cache.h +++ b/cache.h @@ -1174,6 +1174,7 @@ extern struct packed_git { int pack_fd; unsigned pack_local:1, pack_keep:1, + freshened:1, do_not_close:1; unsigned char sha1[20]; /* something like ".git/objects/pack/xxxxx.pack" */ diff --git a/sha1_file.c b/sha1_file.c index 822aaef..f27cbf1 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2999,7 +2999,11 @@ static int freshen_loose_object(const unsigned char *sha1) static int freshen_packed_object(const unsigned char *sha1) { struct pack_entry e; - return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name); + if (!find_pack_entry(sha1, &e)) + return 0; + if (e.p->freshened) + return 1; + return freshen_file(e.p->pack_name); } int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1) If it's not a problem, I'd love to see timings for your case with just the first patch, and then with both. You may also be interested in: http://thread.gmane.org/gmane.comp.version-control.git/266370 which addresses another performance problem related to the freshen/recent code in v2.2. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html