Thanks for the review. On Sun, Oct 27, 2019 at 8:37 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > "Utsav Shah via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > > > From: Utsav Shah <utsav@xxxxxxxxxxx> > > > > git stash runs git reset --hard as its final step, which can be fairly slow on a large repository. > > This change lets us skip that if fsmonitor thinks those files aren't modified. > > > > git stash goes from ~8s -> 2s on my repository after this change. > > Please line-wrap overlong lines. > > More importantly, "stash" may be a mere symptom that does not > deserve this much emphasis. What you improved directly is "git > reset --hard" isn't it? > > The fsmonitor may know that a path hasn't been modified but > "git reset --hard" did not pay attention to it and performed > its own check based on ie_match_stat(), which was inefficient. > > or something like that? > > > if (old && same(old, a)) { > > int update = 0; > > - if (o->reset && o->update && !ce_uptodate(old) && !ce_skip_worktree(old)) { > > + if (o->reset && o->update && !ce_uptodate(old) && !ce_skip_worktree(old) && > > + !(old->ce_flags & CE_FSMONITOR_VALID)) { > > I wonder if !ce_uptodate(old) should say "this one is up to date and > not modified" when CE_FSMONITOR_VALID bit is set. Are there other > codepaths that use ce_uptodate(ce) to decide to do X without paying > attention to CE_FSMONITOR_VALID bit? If there are, are they buggy > in the same way as you found this instance, or do they have legitimate > reason why they only check ce_uptodate(ce) and ignore fsmonitor? > Yes, there are other code paths as well. After reading the code some more, it seems like there's no legitimate need to ignore fsmonitor. > If there isn't, would it make sense to get rid of CE_FSMONITOR_VALID > bit and have fsmonitor directly set CE_UPTODATE bit instead? That > would make this fix unnecessary and fix other codepaths that check > only ce_uptodate() without checking fsmonitor. > There's a few issues with replacing it entirely that I've found. One is the "CE_MATCH_IGNORE_FSMONITOR" flag. This flag can be set to let ie_match_stat skip calling refresh_fsmonitor repeatedly. This is set only in one place right now in preload-index, and it's unclear how necessary this optimization even is, given that refresh_fsmonitor has a check whether it's been called already, and returns if true. The second is that git ls-files has an "f" option that makes it "use lowercase letters for 'fsmonitor clean' files". I think this can simply be replaced by checking if a file is up to date instead of specifically via fsmonitor. If we do go ahead with the replace, we will have to be diligent about calling refresh_fsmonitor everywhere, or we will have correctness issues. I patched git locally to do this, and immediately saw a bug in git stash where the underlying git reset --hard skipped modifying a file it should have. In my opinion refresh_fsmonitor should be called somewhere top level, like an initialization, but I'm not sure if that makes sense for all git subcommands. Do you think it's worth cleaning up and sending this patch instead? It will reduce the surface area of bugs and remove a bunch of functions like mark_fsmonitor_valid/mark_fsmonitor_invalid