[BUG] Performance regression due to #33d4221: write_sha1_file: freshen existing objects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We became aware of slow merge times with the following setup:

The merge is created in a temporary location that uses alternates. The
temporary repository is on a local disk, the alternate object database
on an NFS mount.

After some investigation we believe that #33d4221 (present in git
2.2.0, absent in 2.1.4) is causing this regression in merge time.

The following are merge times (in seconds) with git@33d4221~
(2.1.2-393-gabcb865) (before the change)

      Elapsed         System            User
 Min.   :0.3700   Min.   :0.04000   Min.   :0.3000
 1st Qu.:0.3800   1st Qu.:0.05000   1st Qu.:0.3100
 Median :0.4000   Median :0.06000   Median :0.3300
 Mean   :0.4295   Mean   :0.05905   Mean   :0.3519
 3rd Qu.:0.4600   3rd Qu.:0.07000   3rd Qu.:0.3600
 Max.   :0.5900   Max.   :0.09000   Max.   :0.4900


The following are merge times with git@33d4221 (2.1.2-394-g33d4221):

      Elapsed         System            User
 Min.   : 8.58   Min.   :1.46   Min.   :0.4400
 1st Qu.: 9.63   1st Qu.:1.60   1st Qu.:0.4400
 Median :10.64   Median :1.66   Median :0.4800
 Mean   :10.50   Mean   :1.69   Mean   :0.4986
 3rd Qu.:11.13   3rd Qu.:1.81   3rd Qu.:0.5000
 Max.   :13.96   Max.   :2.05   Max.   :0.6700


As you can see the merge times are an order of magnitude slower after
the change.

The effect of  #33d4221 can be seen when strace'ing the merge:

Running strace on git@33d4221 yields
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 70.79    0.714852         178      4018           utime
 14.73    0.148789           3     50141     50123 lstat
 13.88    0.140198          17      8074      8067 access
  0.24    0.002455         614         4           rename
  0.15    0.001493           3       577           write
  0.06    0.000618          10        65           close
  0.04    0.000453           3       152           brk
  0.04    0.000433          27        16           mkdir
  0.03    0.000310           8        41           fstat


Running strace on git@33d4221~ yields

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.37    0.138516           3     50141     50123 lstat
  0.92    0.001292           2       577           write
  0.37    0.000520          14        38        31 access
  0.18    0.000252          36         7           getcwd
  0.17    0.000237           7        36        20 stat
  0.00    0.000000           0        40           read
  0.00    0.000000           0        87        30 open


My current hypothesis is that the additional `access`, but more
importantly the additional `utime` calls are responsible in the
increased merge times that we see.
NFS stats on the server for the tests seem to confirm this (see
nfsstat-{after,before}-change.txt on
https://bitbucket.org/snippets/ssaasen/oend).
This is certainly due to the fact that this will all happen over NFS
but in 2.1.4 this worked fine and starting with 2.2 this has become
very slow.

Looking at the detailed strace shows that utime will be called
repeatedly in same cases (e.g.
https://bitbucket.org/snippets/ssaasen/oend shows an example where the
same packfile will be updated more than 4000 times in a single merge).

http://www.spinics.net/lists/git/msg240106.html discusses a potential
improvement for this case. Would that be an acceptable avenue to
improve this situation?

Best regards,
Stefan Saasen
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]