On Mon, Oct 29 2018, Jeff King wrote:

> On Mon, Oct 29, 2018 at 09:48:02AM +0900, Junio C Hamano wrote:
>
>> > Of course any cache raises questions of cache invalidation, but I think
>> > we've already dealt with that for this case. When we use
>> > OBJECT_INFO_QUICK, that is a sign that we want to make this kind of
>> > accuracy/speed tradeoff (which does a similar caching thing with
>> > packfiles).
>> >
>> > So putting that all together, could we have something like:
>>
>> I think this conceptually is a vast improvement relative to
>> ".cloning" optimization. Obviously this does not have the huge
>> downside of the other approach that turns the collision detection
>> completely off.
>>
>> A real question is how much performance gain, relative to ".cloning"
>> thing, this approach gives us. If it gives us 80% or more of the
>> gain compared to doing no checking, I'd say we have a clear winner.
>
> My test runs showed it improving index-pack by about 3%, versus 4% for
> no collision checking at all. But there was easily 1% of noise. And much
> more importantly, that was on a Linux system on ext4, where stat is
> fast. I'd be much more curious to hear timing results from people on
> macOS or Windows, or from Geert's original NFS case.

At work we make copious use of NetApp over NFS for filers. I'd say this
is probably typical for enterprise environments. Raw I/O performance
over the wire (writing a large file) is really good, but metadata
(e.g. stat) performance tends to be atrocious.

We host both the in-house Git server (GitLab) and many types of clients
on such a filer (for HA etc.).

As noted by Geert upthread you need to mount the git directories with
lookupcache=positive (see e.g. [1]).

Cloning git.git as --bare onto such a partition with my patch:

    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     60.98    1.802091          19     93896     19813 futex
     14.64    0.432782           7     61415        16 read
      9.40    0.277804           1    199576           pread64
      4.88    0.144172           3     49355        11 write
      3.10    0.091498          31      2919      2880 stat
      2.53    0.074812          31      2431       737 lstat
      1.96    0.057934           3     17257      1276 recvfrom
      0.91    0.026815           3      8543           select
      0.62    0.018425           2      8543           poll
    [...]
    real    0m32.053s
    user    0m21.451s
    sys     0m7.806s

Without:

    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     71.01   31.653787          50    628265     21608 futex
     24.14   10.761950          41    260658    258964 lstat
      2.22    0.988001           5    199576           pread64
      1.32    0.587844          10     59662         3 read
      0.79    0.350625           7     50376        11 write
      0.22    0.096019          33      2919      2880 stat
      0.13    0.057950           4     15821        12 recvfrom
      0.05    0.022385           3      7949           select
      0.04    0.015988           2      7949           poll
      0.03    0.013622        3406         4           wait4
    [...]
    real    4m38.670s
    user    0m29.015s
    sys     0m33.894s

So a reduction in clone time by ~90%.

Performance would be basically the same with your patch. But let's
discuss that elsewhere in this thread. Just wanted to post the
performance numbers here.

1. https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/109#note_12528896
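
For anyone who wants to check the ~90% figure, here is a minimal sketch of the arithmetic, using only the "real" times and the lstat call counts from the two strace summaries above. The helper name to_seconds is made up for this illustration and is not part of any tool.

```python
def to_seconds(t: str) -> float:
    """Convert a `time`-style string such as '4m38.670s' to seconds."""
    minutes, rest = t.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


# Wall-clock ("real") times copied from the two runs above.
with_patch = to_seconds("0m32.053s")
without_patch = to_seconds("4m38.670s")

# Fraction of the unpatched wall-clock time that the patch saves.
reduction = 1 - with_patch / without_patch
print(f"wall-clock reduction: {reduction:.1%}")

# lstat call counts copied from the same two summaries.
lstat_with, lstat_without = 2431, 260658
lstat_avoided = 1 - lstat_with / lstat_without
print(f"lstat calls avoided: {lstat_avoided:.1%}")
```

The wall-clock saving comes out slightly under 90%, which matches the rounded figure above, and nearly all of the lstat calls disappear.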