Hi Tao,

On Tue, 22 Sep 2020, Tao Klerks wrote:

> I've got a couple questions about the "fsmonitor" functionality,
> untracked files, and multithreading.
>
> Background:
>
> In a repo with:
> * A couple hundred thousand tracked files, and a couple hundred
> thousand .gitignored files, across a few thousand directories
> * The --untracked-cache setting, tested and working
> * core.fsmonitor set up with watchman (with the sample integration
> script from january)
> * Git version 2.27.0.windows.1
>
> "git status" takes about 2s
> "git status --untracked-files=all" takes about 20s
>
> When I turn off "core.fsmonitor", the numbers change to something like:
> "git status": 8s
> "git status --untracked-files=all": 9s
>
> Using windows' "procmon" to observe git.exe's behavior from outside, I
> think I've understood a couple things that surprise me:
> 1. when you specify "--untracked-files=all", git scans the entire
> folder tree regardless of the "fsmonitor" hook
> 2. when you specify the "fsmonitor" hook, git does any
> filesystem-scanning in a single-threaded fashion (as opposed to
> multi-threaded without "fsmonitor" / normally)
>
> These two things combine so that with "fsmonitor" set, normal
> command-line git status performance is great, but the performance in
> tools that eagerly look for untracked files (like "Git Extensions" on
> windows) actually suffers - it takes twice as long to run the 'git -c
> diff.ignoreSubModules=none status --porcelain=2 -z
> --untracked-files=all' command that this UI wants (and blocks on, when
> you go to a commit dialog).
>
> Questions:
>
> 1. Is there a reason "--untracked-files=all" causes a full directory
> tree scan even with the "fsmonitor" hook active, or is this
> accidental?

I have a hunch that this might be related to a performance hack we have in
Git for Windows: did you enable FSCache perchance? If so, I _suspect_ that
turning it off would accelerate `git status --untracked-files=all`.

Ciao,
Johannes

> 2. Assuming that the full directory tree scan is indeed necessary even
> with "fsmonitor" (when requesting all untracked files), could it be
> made multithreaded?
>
> (my apologies for the simplistic "outside-in" observations; I don't
> feel qualified to attempt to understand the git source code)
>
> Thanks for any help understanding the optimization opportunities here!
>
> Tao Klerks
>
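
One way the FSCache hunch above could be checked is sketched below. It assumes that "FSCache" corresponds to the Git for Windows `core.fscache` setting and that the commands are run from the top of the affected working tree. The `-c` option overrides the setting for a single invocation, so nothing in the repository configuration is changed, and the timings are only indicative because they depend on how warm the filesystem cache is.

```shell
# Sketch only; assumes core.fscache is the setting behind "FSCache".

# Show what is currently configured (prints nothing if a key is unset).
git config --get core.fscache
git config --get core.fsmonitor

# Time the slow command with FSCache disabled for this invocation only.
time git -c core.fscache=false status --untracked-files=all

# Time it again with FSCache enabled, for comparison.
time git -c core.fscache=true status --untracked-files=all
```

Running each variant a few times and comparing the later runs gives a fairer picture than a single cold run.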