On Fri, May 22, 2015 at 03:02:45PM +0000, steve.norman@xxxxxxxxxxxxxxxxxx wrote: > On Friday, May 22, 2015 @ 11:06 AM Duy Nguyen did write: > > > Strange. Maybe there is something else... Anyway some numbers from me. > > This is nfs3 hosted by Raspberry Pi, accessed over wireless. I just > > run index-pack on git.git pack instead of full clone. > > > > - v1.8.4.1 34s > > - v1.8.4.2 519s (oh.. wireless..) > > - v1.8.4.2 without has_sha1_file() in index-pack.c 33s > > - v1.8.4.2 + Jeff's mtime patch 36s > > Just ran the test again and it was 12 seconds. Too many versions of git available on the > machine and I think I might have run the wrong one: Thanks for re-checking. Here's a series that fixes the rough edges of the patch I sent earlier. I'd appreciate it if you can re-confirm that it improves things for you. [1/3]: stat_validity: handle non-regular files [2/3]: cache.h: move stat_validity definition up [3/3]: prepare_packed_git: use stat_validity to avoid re-reading packs I'm adding Michael to the cc, as I'm abusing the stat_validity code which we worked on in 2013. My big concern here is that using stat_validity with a directory is racy. It works for a regular file like packed-refs because that file is replaced atomically. We fill the validity using fstat() on the same descriptor we read the data from, and nobody modifies an already-written file. So we know it matches what we read. For a directory, I don't think that atomicity guarantee is there. Somebody can modify the direct while we're reading. For the most part, I think that is OK. We fstat() before reading any entries, so our fstat data will then become out of date, and we'll re-read next time. It would be a problem if opendir() looked at the entries at all (e.g., if it called getdents() under Linux before our first readdir() call, then our fstat is happening after the first read, and wouldn't match what we read). I don't have any reason to believe that any libc does that, but it is making an assumption about how opendir() is implemented. The other problem is that I'm not sure stat data is enough to notice when a directory changes. Certainly the mtime should change, but if you have only one-second resolution on your mtimes, we can be fooled. For a regular file that is replaced atomically, the inode will change (and probably the size, too). But I don't know if that is the case for a directory. Can writing an entry go undetected between two stat() calls (or something even more pathological, like deleting one entry and writing another one)? -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html