On Thu, Jun 1, 2017 at 10:51 PM, Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote: > On Thu, Jun 1, 2017 at 5:50 PM, Ben Peart <peartben@xxxxxxxxx> wrote: >> Changes from V3 include: >> - update test script based on feedback >> - update template hook proc with better post-processing code and make >> it executable > > I have watchman running finally, so aside from issues applying this I > can finally test this. I set up a watch of linux.git: > > $ watchman watch $PWD > $ watchman watch-list|jq '.roots[]' > "/home/avar/g/linux" > > And first, for simplicity, I'll be turning core.fsmonitore on later: > > $ for c in fsmonitor untrackedCache splitIndex; do git config core.$c > false; done > > On a hot fs cache running "git status" takes me 80ms, but if I flush caches: > > # free && sync && echo 3 > /proc/sys/vm/drop_caches && free > > Running "git status" takes me 4s. Now, right after flushing those caches: > > $ date +%s > 1496349235 > $ time (echo '["query", "/home/avar/g/linux", {"since": 1496349235, > "fields":["name"]}]' | watchman -j) | jq '.files[]' > > real 0m0.664s > user 0m0.000s > sys 0m0.004s > $ touch foo bar > $ time (echo '["query", "/home/avar/g/linux", {"since": 1496349235, > "fields":["name"]}]' | watchman -j) | jq '.files[]' > "bar" > "foo" > > real 0m0.044s > user 0m0.000s > sys 0m0.000s > > Without the monitor running "git status' takes me ~2.5s, i.e. from cold cache: > > $ time GIT_TRACE=1 ~/g/git/git-status > 20:34:49.154731 git.c:369 trace: built-in: git 'status' > On branch master > Your branch is up-to-date with 'origin/master'. > > Untracked files: > (use "git add <file>..." to include in what will be committed) > > bar > foo > > nothing added to commit but untracked files present (use "git add" to track) > > real 0m2.546s > user 0m0.060s > sys 0m0.228s > > Now, I would expect the case where the working tree isn't in the fs > cache to be lightning fast, since the watchman command is really fast > (flushed the cache again, turned on the fsmonitor): > > $ date +%s > 1496349523 > $ time (echo '["query", "/home/avar/g/linux", {"since": 1496349523, > "fields":["name"]}]' | watchman -j) | jq '.files[]' > > real 0m0.529s > user 0m0.000s > sys 0m0.004s > $ touch foo bar > $ time (echo '["query", "/home/avar/g/linux", {"since": 1496349523, > "fields":["name"]}]' | watchman -j) | jq '.files[]' > "bar" > "foo" > > real 0m0.312s > user 0m0.000s > sys 0m0.000s > > But if I run "git status" (and I instrumented the hook to dump the > changed files) it takes a long time (those 10s are just the disk being > slow, but it should be ~0s, right?): > > $ time GIT_TRACE=1 ~/g/git/git-status > 20:39:11.250128 git.c:369 trace: built-in: git 'status' > 20:39:14.586720 run-command.c:626 trace: run_command: > '.git/hooks/query-fsmonitor' '1' '1496349494197821000' > Watchman says these changed: bar foo hi there .git .git/index.lock .git/index > On branch master > Your branch is up-to-date with 'origin/master'. > > Untracked files: > (use "git add <file>..." to include in what will be committed) > > bar > foo > > nothing added to commit but untracked files present (use "git add" to track) > > real 0m10.791s > user 0m0.108s > sys 0m0.228s > > If I re-run that it takes ~160-200ms with the hook, 80-120ms without > it, so in this case once the data is in the fscache watchman is > actually slower. > > But manually running watchman shows that it's really fast, usually > returning within 3ms, I flush the fscache in the middle of this, > that's the 600ms time, that could be some global kernel lock or > something due to the odd manual proc operation, I haven't actually > tested this on a system under memory pressure: > > $ for i in {1..10}; do (time (printf '["query", "/home/avar/g/linux", > {"since": %s, "fields":["name"]}]' $(($(date +%s)-3)) | watchman -j | > jq '.files[]')) 2>&1 | grep -e '"' -e ^real && touch $i && sleep 1; > done > real 0m0.004s > "1" > real 0m0.003s > "2" > "1" > real 0m0.605s > "3" > real 0m0.003s > "4" > "3" > real 0m0.003s > "5" > "4" > real 0m0.003s > "6" > "5" > real 0m0.003s > "7" > "6" > real 0m0.003s > "8" > "7" > real 0m0.003s > "9" > "8" > real 0m0.003s > > So something really odd is going on here. This should be speeding up > "git status" a lot, even with a hot fs cache doing stat on all of > linux.git takes a lot longer than 3ms, but for some reason it's > slower. Dug a bit further, with cold cache and no fsmonitor, gprof output from -O0 compiled git: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 37.50 0.03 0.03 123886 0.00 0.00 memihash 12.50 0.04 0.01 284727 0.00 0.00 match_basename 12.50 0.05 0.01 73989 0.00 0.00 match_pathname 12.50 0.06 0.01 59924 0.00 0.00 hashmap_add 12.50 0.07 0.01 59847 0.00 0.00 same_name 12.50 0.08 0.01 59844 0.00 0.00 hash_index_entry 0.00 0.08 0.00 598446 0.00 0.00 git_bswap32 0.00 0.08 0.00 259150 0.00 0.00 fspathncmp 0.00 0.08 0.00 178731 0.00 0.00 match_pathspec 0.00 0.08 0.00 178655 0.00 0.00 do_match_pathspec And then cold cache with fsmonitor: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 66.67 0.04 0.04 96696 0.00 0.00 blk_SHA1_Block 16.67 0.05 0.01 59844 0.00 0.00 cache_entry_from_ondisk 16.67 0.06 0.01 58869 0.00 0.00 longest_path_match 0.00 0.06 0.00 1547157 0.00 0.00 git_bswap32 0.00 0.06 0.00 1196893 0.00 0.00 git_bswap32 0.00 0.06 0.00 284727 0.00 0.00 match_basename 0.00 0.06 0.00 259150 0.00 0.00 fspathncmp 0.00 0.06 0.00 178561 0.00 0.00 do_match_pathspec 0.00 0.06 0.00 178543 0.00 0.00 match_pathspec 0.00 0.06 0.00 145141 0.00 0.00 memory_limit_check