Re: Question about fsmonitor and --untracked-files=all

Hi Tao,

On Tue, 22 Sep 2020, Tao Klerks wrote:

> I've got a couple questions about the "fsmonitor" functionality,
> untracked files, and multithreading.
>
> Background:
>
> In a repo with:
>  * A couple hundred thousand tracked files, and a couple hundred
> thousand .gitignored files, across a few thousand directories
>  * The --untracked-cache setting, tested and working
>  * core.fsmonitor set up with watchman (with the sample integration
> script from January)
>  * Git version 2.27.0.windows.1
>
> "git status" takes about 2s
> "git status --untracked-files=all" takes about 20s
>
> When I turn off "core.fsmonitor", the numbers change to something like:
> "git status": 8s
> "git status --untracked-files=all": 9s
>
> Using Windows' "procmon" to observe git.exe's behavior from outside, I
> think I've understood a couple things that surprise me:
> 1. when you specify "--untracked-files=all", git scans the entire
> folder tree regardless of the "fsmonitor" hook
> 2. when you specify the "fsmonitor" hook, git does any
> filesystem-scanning in a single-threaded fashion (as opposed to
> multi-threaded without "fsmonitor" / normally)
>
> These two things combine so that with "fsmonitor" set, normal
> command-line git status performance is great, but the performance in
> tools that eagerly look for untracked files (like "Git Extensions" on
> Windows) actually suffers - it takes twice as long to run the 'git -c
> diff.ignoreSubModules=none status --porcelain=2 -z
> --untracked-files=all' command that this UI wants (and blocks on, when
> you go to a commit dialog).
>
> Questions:
>
> 1. Is there a reason "--untracked-files=all" causes a full directory
> tree scan even with the "fsmonitor" hook active, or is this
> accidental?

I have a hunch that this might be related to a performance hack we have in
Git for Windows: did you enable FSCache perchance?

If so, I _suspect_ that turning it off would accelerate `git status
--untracked-files=all`.
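
To check that hunch without touching your configuration, a rough timing
sketch along these lines might help. It assumes Git for Windows (where the
`core.fscache` setting exists) and passes the setting as a one-off `-c`
override; the helper names `status_command` and `time_status` are made up
for illustration and are not part of Git.

```python
import subprocess
import time


def status_command(fscache):
    """Build a one-off `git status` command line.

    The core.fscache value is passed with `-c`, so nothing is written
    to the repository or user configuration.
    """
    value = "true" if fscache else "false"
    return [
        "git",
        "-c", f"core.fscache={value}",
        "status",
        "--untracked-files=all",
    ]


def time_status(repo, fscache):
    """Run the command inside `repo` and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        status_command(fscache),
        cwd=repo,
        check=True,
        stdout=subprocess.DEVNULL,  # only the timing is of interest here
    )
    return time.perf_counter() - start
```

Calling `time_status(".", True)` and `time_status(".", False)` from the
top of the worktree, a few times each so that the first (cold) run does
not skew the comparison, should show whether FSCache is the culprit.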

Ciao,
Johannes

> 2. Assuming that the full directory tree scan is indeed necessary even
> with "fsmonitor" (when requesting all untracked files), could it be
> made multithreaded?
>
> (my apologies for the simplistic "outside-in" observations; I don't
> feel qualified to attempt to understand the git source code)
>
> Thanks for any help understanding the optimization opportunities here!
>
> Tao Klerks
>



