Re: [PATCH v4 0/6] Fast git status via a file system watcher

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 1, 2017 at 5:50 PM, Ben Peart <peartben@xxxxxxxxx> wrote:
> Changes from V3 include:
>  - update test script based on feedback
>  - update template hook proc with better post-processing code and make
>    it executable

I have watchman running finally, so aside from issues applying this I
can finally test this. I set up a watch of linux.git:

$ watchman watch $PWD
$ watchman watch-list|jq '.roots[]'
"/home/avar/g/linux"

And first, for simplicity, I'll be turning core.fsmonitore on later:

$ for c in fsmonitor untrackedCache splitIndex; do git config core.$c
false; done

On a hot fs cache running "git status" takes me 80ms, but if I flush caches:

# free && sync && echo 3 > /proc/sys/vm/drop_caches && free

Running "git status" takes me 4s. Now, right after flushing those caches:

$ date +%s
1496349235
$ time (echo '["query", "/home/avar/g/linux", {"since": 1496349235,
"fields":["name"]}]' | watchman -j) | jq '.files[]'

real    0m0.664s
user    0m0.000s
sys     0m0.004s
$ touch foo bar
$ time (echo '["query", "/home/avar/g/linux", {"since": 1496349235,
"fields":["name"]}]' | watchman -j) | jq '.files[]'
"bar"
"foo"

real    0m0.044s
user    0m0.000s
sys     0m0.000s

Without the monitor running "git status' takes me ~2.5s, i.e. from cold cache:

$ time GIT_TRACE=1 ~/g/git/git-status
20:34:49.154731 git.c:369               trace: built-in: git 'status'
On branch master
Your branch is up-to-date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        bar
        foo

nothing added to commit but untracked files present (use "git add" to track)

real    0m2.546s
user    0m0.060s
sys     0m0.228s

Now, I would expect the case where the working tree isn't in the fs
cache to be lightning fast, since the watchman command is really fast
(flushed the cache again, turned on the fsmonitor):

$ date +%s
1496349523
$ time (echo '["query", "/home/avar/g/linux", {"since": 1496349523,
"fields":["name"]}]' | watchman -j) | jq '.files[]'

real    0m0.529s
user    0m0.000s
sys     0m0.004s
$ touch foo bar
$ time (echo '["query", "/home/avar/g/linux", {"since": 1496349523,
"fields":["name"]}]' | watchman -j) | jq '.files[]'
"bar"
"foo"

real    0m0.312s
user    0m0.000s
sys     0m0.000s

But if I run "git status" (and I instrumented the hook to dump the
changed files) it takes a long time (those 10s are just the disk being
slow, but it should be ~0s, right?):

$ time GIT_TRACE=1 ~/g/git/git-status
20:39:11.250128 git.c:369               trace: built-in: git 'status'
20:39:14.586720 run-command.c:626       trace: run_command:
'.git/hooks/query-fsmonitor' '1' '1496349494197821000'
Watchman says these changed: bar foo hi there .git .git/index.lock .git/index
On branch master
Your branch is up-to-date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        bar
        foo

nothing added to commit but untracked files present (use "git add" to track)

real    0m10.791s
user    0m0.108s
sys     0m0.228s

If I re-run that it takes ~160-200ms with the hook, 80-120ms without
it, so in this case once the data is in the fscache watchman is
actually slower.

But manually running watchman shows that it's really fast, usually
returning within 3ms, I flush the fscache in the middle of this,
that's the 600ms time, that could be some global kernel lock or
something due to the odd manual proc operation, I haven't actually
tested this on a system under memory pressure:

$ for i in {1..10}; do (time (printf '["query", "/home/avar/g/linux",
{"since": %s, "fields":["name"]}]' $(($(date +%s)-3)) | watchman -j |
jq '.files[]')) 2>&1 | grep -e '"' -e ^real && touch $i && sleep 1;
done
real    0m0.004s
"1"
real    0m0.003s
"2"
"1"
real    0m0.605s
"3"
real    0m0.003s
"4"
"3"
real    0m0.003s
"5"
"4"
real    0m0.003s
"6"
"5"
real    0m0.003s
"7"
"6"
real    0m0.003s
"8"
"7"
real    0m0.003s
"9"
"8"
real    0m0.003s

So something really odd is going on here. This should be speeding up
"git status" a lot, even with a hot fs cache doing stat on all of
linux.git takes a lot longer than 3ms, but for some reason it's
slower.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]