Re: [PATCH/RFC v3 7/9] write_entry(): use fstat() instead of lstat() when file is open

Kjetil Barvik <barvik@xxxxxxxxxxxx> · Fri, 06 Feb 2009 12:06:57 +0100

* On Thu, 5 Feb 2009, Johannes Sixt wrote:
|
|> Kjetil Barvik schrieb:
|> >   And, yes, since each lstat() call cost approximately 44 microseconds
|> >   compared to 12-16 for each lstat() on my Linux box, there was a little
|>                                ^^^^^^^ fstat()?
|> >   performance gain from this patch.
|> 
|> This does look like a good gain. But do you have hard numbers that back
|> the claim?

  __________________
  Test description

  I have used the following 8 git binaries:

$ for g in ./git_t*; do printf "$g => $($g --version)\n"; done
./git_t0 => git version 1.6.1.80.g7eb5
./git_t1 => git version 1.6.1.85.gbda6       <= added lstat_cache()
./git_t2 => git version 1.6.1.2.306.gc0f6f
./git_t3 => git version 1.6.1.2.307.g55385
./git_t4 => git version 1.6.1.2.308.g052a
./git_t5 => git version 1.6.1.2.310.g40dd2   <= added schedule_dir_for_removal()
./git_t6 => git version 1.6.1.2.313.g9deee
./git_t7 => git version 1.6.1.2.314.g0ad9    <= added this patch (7/9)

  Except for git_t7 all commits should be in origin/pu, so people should
  be able to do git show/diff/log on the last hex chars on the version-
  strings to see the differences.

  CFLAGS="-mtune=core2 -O2 -fomit-frame-pointer -fno-stack-protector -g0 -s"

  Each git_t* binary is run with args "checkout -q my-v.2.6.2[57]" for a
  total of 100 times (50/50 to my-v2.6.25/my-v2.6.27).  Before each run
  the test script sleeps 20 seconds to allow the disk to finish and
  being idle.  Time is collected by /usr/bin/time -o output-file prog.

  While the test script was run, the laptop with an Intel Core2 Duo 2
  GHz processor, was run without X and was otherwise idle.  The test
  script took 9 hours and 45 minutes to complete.

  __________________
  Test numbers

$ for ((t=0; $t<=7; t++)); do echo "git_t$t => $(parse_usr_bin_time_lines.pl git_t$t*)"; done
git_t0 =>  5.646 user  8.322 sys  25.579 real  54.6% CPU  faults:  0 major 39587 minor
git_t1 =>  5.631 user  6.826 sys  23.941 real  52.1% CPU  faults:  0 major 39901 minor
git_t2 =>  5.622 user  6.854 sys  24.036 real  52.1% CPU  faults:  0 major 39298 minor
git_t3 =>  5.636 user  6.867 sys  24.088 real  52.0% CPU  faults:  0 major 39786 minor
git_t4 =>  5.640 user  6.818 sys  24.006 real  52.0% CPU  faults:  0 major 39629 minor
git_t5 =>  5.642 user  6.552 sys  23.805 real  51.4% CPU  faults:  0 major 39707 minor
git_t6 =>  5.629 user  6.518 sys  22.981 real  53.0% CPU  faults:  0 major 40029 minor
git_t7 =>  5.629 user  6.051 sys  23.013 real  50.9% CPU  faults:  0 major 39451 minor

|> If you can squeeze out a 10% improvement on Linux with this patch, we
|> should take it, and if it turns out that it doesn't work on Windows, we
|> could trivially throw in an #ifdef MINGW (or even #ifdef WIN32 if Cygwin
|> is affected, too) that skips the fstat() optimization.

  For the system used time, the improvement was (- 6.518 6.051) = 0.467
  seconds, or (/ (* (- 6.518 6.051) 100.0) 6.518) = 7.2%, so not 10%.

  Funny to see that in:
     http://article.gmane.org/gmane.comp.version-control.git/107281

  I guessed the improvement to be (quote):
     "(* 14403 (- 44 12)) = 460 896 microseconds system time"

  So I missed only by a little over 6ms. :-)

* Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:
| Of course, what we _really_ would do is to provide a flag, say, 
| FSTAT_UNRELIABLE and test for _that_ (after defining it in the Makefile 
| for the appropriate platforms).

  Or, maybe

     #define FSTAT_RELIABLE 1

  for Linux only?  Then we can change the if-test inside this patch to
  the following:

-  if (state->refresh_cache && !to_tempfile && !state->base_dir_len) {
+  if (state->refresh_cache && !to_tempfile && !state->base_dir_len && 
+      FSTAT_RELIABLE) {

  Then we do not have to have #if-defines inside the source code, only
  in one header file.

  But, question: is this patch worth the extra lines added to the source
  code?

  -- kjetil

  PS!  Junio, I think this patch series should be inside pu some days
       more, since things obviously needs more refinement and tests.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html