[PATCH/RFC v6 0/3] git checkout: optimise away lots of lstat() calls

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Changes since version 5:

- After comments from René Scharfe, the usage of the lstat_stat()
  function inside entry.c has got it's own wrapper,
  has_dirs_only_path().

- All wrapper functions inside cache.h is moved inside symlink.c, and
  updated to be normal C extern functions (not inline functions).
  Thanks, René!

- The clear_lstat_cache() function is deleted, and instead we do an
  safeguard if-test to detect changes to track_flags or
  prefix_len_stat_func.

- Introduce a 'static inline reset_lstat_cache()' function, and make
  an struct of all the cache variables.

- The patch is split up into 3 parts again.

That's all changes this time!  Then some questions:

* Inside commit message for commit c40641b77b0274186fd1b327d5dc3246f814aaaf
* ("Optimize symlink/directory detection") Linus Torvalds writes
| [...]
| This can - and should - probably be extended upon so that we
| eventually never do a bare 'lstat()' on any path entries at *all*
| when checking the index, but always check the full path
| carefully. Right now we do not generally check the whole path for
| all our normal quick index revalidation.

  I am not quite sure what Linus is thinking of here, and in
  particular what an (eventually) extended interface should look like.

  I have been thinking of maybe introduce a "git_lstat()" like wrapper
  with the same arguments as the lstat() function, and then when
  possible, retrieve the contents of 'struct stat' from the cache.
  Would this sort of "git_lstat()" like wrapper be useful?

  I think it then could be used to simplify the first 7 lines of the
  check_removed() function inside diff-lib.c, to in some cases be able
  to do 1 instead of 2 calls to lstat(), and maybe similar things at
  other places?

  Should the wrapper return an error if it is able to discover an
  symlink somewhere in the path, like has_symlink_leading_path()
  already does today?

  But, unless someone strongly ask for this "git_lstat()" like wrapper
  to be included inside this patch series, I guess this should be
  future work.

  I hope that the now 3 patches is at least one step in the right
  direction!

| We should also make sure that we're careful about all the
| invalidation, ie when we remove a link and replace it by a directory
| we should invalidate the symlink cache if it matches (and vice versa
| for the directory cache).

  This is an argument for the clear_lstat_cache() function, I guess.
        

-----

I have just started to clone some interesting Linux git trees to watch
the development more closely, and therefore also started to use git. I
noticed that 'git checkout' takes some time, and especially that the
'git checkout' command does lots and lots of lstat() calls.

After some more investigation and thinking, I have made 3 patches and
been able to optimise away over 40% of all lstat() calls in some cases
for the 'git checkout' command.  Also, if you use a large path to the
'--prefix' argument to the 'git checkout-index' command, and you have
lots of files, the savings can be really huge!

The 3 patches is against git master, and the git 'make test' test
suite still passes after each patch.  To document the improvement,
below is some numbers, which compares before and after the 3
patches. To reproduce the numbers:

- git clone the Linux git tree to be able to get the Linux tags
  'v2.6.25' and 'v2.6.27'.
- git checkout -b my-v2.6.27 v2.6.27
- git checkout -b my-v2.6.25 v2.6.25

Then, when the current branch is 'my-v2.6.25', do:

  strace -o strace_to27 -T git checkout -q my-v2.6.27

And then pretty print the 'strace_to27' file.  Below is the numbers
from the current git version (before the 3 patches).  Notice that we
do an lstat() call on the "arch" directory over 6000 times!

TOTAL      185151 100.000% OK:165544 NOT: 19607  11.136001 sec   60 usec/call
lstat64    120954  65.327% OK:107013 NOT: 13941   5.388727 sec   45 usec/call
  strings  120954 tot  30163 uniq   4.010 /uniq   5.388727 sec   45 usec/call
  files     61491 tot  28712 uniq   2.142 /uniq   2.740520 sec   45 usec/call
  dirs      45522 tot   1436 uniq  31.701 /uniq   1.994448 sec   44 usec/call
  errors    13941 tot   5189 uniq   2.687 /uniq   0.653759 sec   47 usec/call
             6297   5.206% OK:  6297 NOT:     0  "arch"
             4544   3.757% OK:  4544 NOT:     0  "drivers"
             1816   1.501% OK:  1816 NOT:     0  "arch/arm"
             1499   1.239% OK:  1499 NOT:     0  "include"
              912   0.754% OK:   912 NOT:     0  "arch/powerpc"
              764   0.632% OK:   764 NOT:     0  "fs"
              746   0.617% OK:   746 NOT:     0  "drivers/net"
              662   0.547% OK:   662 NOT:     0  "net"
              652   0.539% OK:   325 NOT:   327  "arch/sparc/include"
              636   0.526% OK:   636 NOT:     0  "drivers/media"
              606   0.501% OK:   606 NOT:     0  "include/linux"
              533   0.441% OK:   533 NOT:     0  "arch/sh"
              522   0.432% OK:   260 NOT:   262  "arch/powerpc/include"
              488   0.403% OK:   243 NOT:   245  "arch/sh/include"
              413   0.341% OK:   413 NOT:     0  "arch/sparc"
              390   0.322% OK:   390 NOT:     0  "arch/x86"
              383   0.317% OK:   383 NOT:     0  "Documentation"
              370   0.306% OK:   184 NOT:   186  "arch/ia64/include"
              366   0.303% OK:   366 NOT:     0  "drivers/media/video"
              348   0.288% OK:   173 NOT:   175  "arch/arm/include"

Here is the numbers after applying the 3 patches.  Notice how nice the
top 20 entries list now looks!

TOTAL      134155 100.000% OK:122102 NOT: 12053  11.069389 sec   83 usec/call
lstat64     69876  52.086% OK: 63491 NOT:  6385   3.410007 sec   49 usec/call
  strings   69876 tot  30163 uniq   2.317 /uniq   3.410007 sec   49 usec/call
  files     61491 tot  28712 uniq   2.142 /uniq   3.023238 sec   49 usec/call
  dirs       2000 tot   1436 uniq   1.393 /uniq   0.085953 sec   43 usec/call
  errors     6385 tot   5189 uniq   1.230 /uniq   0.300816 sec   47 usec/call
                4   0.006% OK:     4 NOT:     0  ".gitignore"
                4   0.006% OK:     4 NOT:     0  ".mailmap"
                4   0.006% OK:     4 NOT:     0  "CREDITS"
                4   0.006% OK:     4 NOT:     0  "Documentation/00-INDEX"
                4   0.006% OK:     4 NOT:     0  "Documentation/ABI/testing/sysfs-block"
                4   0.006% OK:     4 NOT:     0  "Documentation/ABI/testing/sysfs-firmware-acpi"
                4   0.006% OK:     4 NOT:     0  "Documentation/CodingStyle"
                4   0.006% OK:     4 NOT:     0  "Documentation/DMA-API.txt"
                4   0.006% OK:     4 NOT:     0  "Documentation/DMA-mapping.txt"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/Makefile"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/gadget.tmpl"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/kernel-api.tmpl"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/kernel-locking.tmpl"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/procfs-guide.tmpl"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/procfs_example.c"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/rapidio.tmpl"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/s390-drivers.tmpl"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/uio-howto.tmpl"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/videobook.tmpl"
                4   0.006% OK:     4 NOT:     0  "Documentation/DocBook/writing_usb_driver.tmpl"

Comments?


Kjetil Barvik (3):
  lstat_cache(): more cache effective symlink/directory detection
  lstat_cache(): introduce has_symlink_or_noent_leading_path() function
  lstat_cache(): introduce has_dirs_only_path() function

 cache.h        |    2 +
 entry.c        |   34 +++------
 symlinks.c     |  210 ++++++++++++++++++++++++++++++++++++++++++++-----------
 unpack-trees.c |    4 +-
 4 files changed, 183 insertions(+), 67 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux