On Sat, Oct 17, 2020 at 10:02:04PM -0700, Junio C Hamano wrote: > Taylor Blau <me@xxxxxxxxxxxx> writes: > > > Hmm. I do agree that I'd like to stay out of the business of trying to > > figure out exactly what that trade-off is (although I'm sure that it > > exists), only because it seems likely to vary to a large extent from > > repository to repository. (That is, 20% may be a good number for some > > repository, but a terrible choice for another). > > I think both of you misunderstood me. > > My question was a simple yes/no "does there a trade off exist?" > question and the sentences with 20% in it were mere example of > possible trade-off I had in mind that _could_ exist. I wasn't even > suggesting to figure out what the optimum cut-off heuristics would > be (e.g. solving "when more than N% paths are subject to diff > fsmonitor is faster" for N). > > I was hoping that we can show that even having to lstat just a > single path is expensive enough---IOW, "there is no trade-off worth > worrying about, because talking to fsmonitor is so cheap compared to > the cost of even a single lstst" would have been a valid and happy > answer. With such a number, there is no risk of introducing an > unwarranted performance regression to use cases that we did not > anticipate by adding an unconditional call to refresh_fsmonitor(). > > But without any rationale, the performance implication of adding an > unconditional call to refresh_fsmonitor() would become much muddier. Aha; thanks for clarifying. I'm glad we agree that finding 'N' would not be worth it, or at least that showing that talking to fsmonitor is cheaper than a single lstat would be more worthwhile. Nipunn - I don't have fsmonitor/watchman setup on my workstation, but if you do, some numbers (or an interpretation of the numbers you already provided) on this would be really useful. If you don't have it set up, or don't have time to measure it, let me know, and I'd be happy to take a look. > > But, I think that we can invoke watchman better here; the > > fsmonitor-watchman hook has no notion of a "pathspec", so every query > > just asks for everything that isn't in '$GIT_DIR'. Is there anything > > preventing us from taking an optional pathspec and building up a more > > targeted query? > > Yup, it is what I had in mind when I brought up the pathspec. It > may be something worth pursuing longer term, but not within the > scope of this patch. > > > There is some overhead to invoke the hook and talk to watchman, but > > I'd expect that to be dwarfed by not having to issue O(# files) > > syscalls. > > "invoke the hook"---is that a pipe+fork+exec, or something else that > is far lighter-weight? The former; see 'fsmonitor.c:query_fsmonitor()'. Thanks, Taylor