Re: [PATCH v4 0/6] Fast git status via a file system watcher

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Any chance you can provide me with a bash script that contains the exact sequence of commands you are running to get this result? I've been trying to replicate it using your notes but have not been able to. I'd like to see if it is a repo difference, a platform difference, a command sequence difference (or something else entirely :)).

Thanks,

Ben

On 6/1/2017 5:13 PM, Ævar Arnfjörð Bjarmason wrote:
On Thu, Jun 1, 2017 at 10:51 PM, Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
On Thu, Jun 1, 2017 at 5:50 PM, Ben Peart <peartben@xxxxxxxxx> wrote:
Changes from V3 include:
  - update test script based on feedback
  - update template hook proc with better post-processing code and make
    it executable

I have watchman running finally, so aside from issues applying this I
can finally test this. I set up a watch of linux.git:

$ watchman watch $PWD
$ watchman watch-list|jq '.roots[]'
"/home/avar/g/linux"

And first, for simplicity, I'll be turning core.fsmonitore on later:

$ for c in fsmonitor untrackedCache splitIndex; do git config core.$c
false; done

On a hot fs cache running "git status" takes me 80ms, but if I flush caches:

# free && sync && echo 3 > /proc/sys/vm/drop_caches && free

Running "git status" takes me 4s. Now, right after flushing those caches:

$ date +%s
1496349235
$ time (echo '["query", "/home/avar/g/linux", {"since": 1496349235,
"fields":["name"]}]' | watchman -j) | jq '.files[]'

real    0m0.664s
user    0m0.000s
sys     0m0.004s
$ touch foo bar
$ time (echo '["query", "/home/avar/g/linux", {"since": 1496349235,
"fields":["name"]}]' | watchman -j) | jq '.files[]'
"bar"
"foo"

real    0m0.044s
user    0m0.000s
sys     0m0.000s

Without the monitor running "git status' takes me ~2.5s, i.e. from cold cache:

$ time GIT_TRACE=1 ~/g/git/git-status
20:34:49.154731 git.c:369               trace: built-in: git 'status'
On branch master
Your branch is up-to-date with 'origin/master'.

Untracked files:
   (use "git add <file>..." to include in what will be committed)

         bar
         foo

nothing added to commit but untracked files present (use "git add" to track)

real    0m2.546s
user    0m0.060s
sys     0m0.228s

Now, I would expect the case where the working tree isn't in the fs
cache to be lightning fast, since the watchman command is really fast
(flushed the cache again, turned on the fsmonitor):

$ date +%s
1496349523
$ time (echo '["query", "/home/avar/g/linux", {"since": 1496349523,
"fields":["name"]}]' | watchman -j) | jq '.files[]'

real    0m0.529s
user    0m0.000s
sys     0m0.004s
$ touch foo bar
$ time (echo '["query", "/home/avar/g/linux", {"since": 1496349523,
"fields":["name"]}]' | watchman -j) | jq '.files[]'
"bar"
"foo"

real    0m0.312s
user    0m0.000s
sys     0m0.000s

But if I run "git status" (and I instrumented the hook to dump the
changed files) it takes a long time (those 10s are just the disk being
slow, but it should be ~0s, right?):

$ time GIT_TRACE=1 ~/g/git/git-status
20:39:11.250128 git.c:369               trace: built-in: git 'status'
20:39:14.586720 run-command.c:626       trace: run_command:
'.git/hooks/query-fsmonitor' '1' '1496349494197821000'
Watchman says these changed: bar foo hi there .git .git/index.lock .git/index
On branch master
Your branch is up-to-date with 'origin/master'.

Untracked files:
   (use "git add <file>..." to include in what will be committed)

         bar
         foo

nothing added to commit but untracked files present (use "git add" to track)

real    0m10.791s
user    0m0.108s
sys     0m0.228s

If I re-run that it takes ~160-200ms with the hook, 80-120ms without
it, so in this case once the data is in the fscache watchman is
actually slower.

But manually running watchman shows that it's really fast, usually
returning within 3ms, I flush the fscache in the middle of this,
that's the 600ms time, that could be some global kernel lock or
something due to the odd manual proc operation, I haven't actually
tested this on a system under memory pressure:

$ for i in {1..10}; do (time (printf '["query", "/home/avar/g/linux",
{"since": %s, "fields":["name"]}]' $(($(date +%s)-3)) | watchman -j |
jq '.files[]')) 2>&1 | grep -e '"' -e ^real && touch $i && sleep 1;
done
real    0m0.004s
"1"
real    0m0.003s
"2"
"1"
real    0m0.605s
"3"
real    0m0.003s
"4"
"3"
real    0m0.003s
"5"
"4"
real    0m0.003s
"6"
"5"
real    0m0.003s
"7"
"6"
real    0m0.003s
"8"
"7"
real    0m0.003s
"9"
"8"
real    0m0.003s

So something really odd is going on here. This should be speeding up
"git status" a lot, even with a hot fs cache doing stat on all of
linux.git takes a lot longer than 3ms, but for some reason it's
slower.

Dug a bit further, with cold cache and no fsmonitor, gprof output from
-O0 compiled git:

Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls  ms/call  ms/call  name
  37.50      0.03     0.03   123886     0.00     0.00  memihash
  12.50      0.04     0.01   284727     0.00     0.00  match_basename
  12.50      0.05     0.01    73989     0.00     0.00  match_pathname
  12.50      0.06     0.01    59924     0.00     0.00  hashmap_add
  12.50      0.07     0.01    59847     0.00     0.00  same_name
  12.50      0.08     0.01    59844     0.00     0.00  hash_index_entry
   0.00      0.08     0.00   598446     0.00     0.00  git_bswap32
   0.00      0.08     0.00   259150     0.00     0.00  fspathncmp
   0.00      0.08     0.00   178731     0.00     0.00  match_pathspec
   0.00      0.08     0.00   178655     0.00     0.00  do_match_pathspec
And then cold cache with fsmonitor:

Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls  ms/call  ms/call  name
  66.67      0.04     0.04    96696     0.00     0.00  blk_SHA1_Block
  16.67      0.05     0.01    59844     0.00     0.00  cache_entry_from_ondisk
  16.67      0.06     0.01    58869     0.00     0.00  longest_path_match
   0.00      0.06     0.00  1547157     0.00     0.00  git_bswap32
   0.00      0.06     0.00  1196893     0.00     0.00  git_bswap32
   0.00      0.06     0.00   284727     0.00     0.00  match_basename
   0.00      0.06     0.00   259150     0.00     0.00  fspathncmp
   0.00      0.06     0.00   178561     0.00     0.00  do_match_pathspec
   0.00      0.06     0.00   178543     0.00     0.00  match_pathspec
   0.00      0.06     0.00   145141     0.00     0.00  memory_limit_check




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]