Re: [PATCH 1/1] unpack-trees: skip lstat based on fsmonitor

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks for the review.

On Sun, Oct 27, 2019 at 8:37 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> "Utsav Shah via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>
> > From: Utsav Shah <utsav@xxxxxxxxxxx>
> >
> > git stash runs git reset --hard as its final step, which can be fairly slow on a large repository.
> > This change lets us skip that if fsmonitor thinks those files aren't modified.
> >
> > git stash goes from ~8s -> 2s on my repository after this change.
>
> Please line-wrap overlong lines.
>
> More importantly, "stash" may be a mere symptom that does not
> deserve this much emphasis.  What you improved directly is "git
> reset --hard" isn't it?
>
>     The fsmonitor may know that a path hasn't been modified but
>     "git reset --hard" did not pay attention to it and performed
>     its own check based on ie_match_stat(), which was inefficient.
>
> or something like that?
>
> >       if (old && same(old, a)) {
> >               int update = 0;
> > -             if (o->reset && o->update && !ce_uptodate(old) && !ce_skip_worktree(old)) {
> > +             if (o->reset && o->update && !ce_uptodate(old) && !ce_skip_worktree(old) &&
> > +                 !(old->ce_flags & CE_FSMONITOR_VALID)) {
>
> I wonder if !ce_uptodate(old) should say "this one is up to date and
> not modified" when CE_FSMONITOR_VALID bit is set.  Are there other
> codepaths that use ce_uptodate(ce) to decide to do X without paying
> attention to CE_FSMONITOR_VALID bit?  If there are, are they buggy
> in the same way as you found this instance, or do they have legitimate
> reason why they only check ce_uptodate(ce) and ignore fsmonitor?
>

Yes, there are other code paths as well. After reading the code some
more, it seems like there's no legitimate need to ignore fsmonitor.

> If there isn't, would it make sense to get rid of CE_FSMONITOR_VALID
> bit and have fsmonitor directly set CE_UPTODATE bit instead?  That
> would make this fix unnecessary and fix other codepaths that check
> only ce_uptodate() without checking fsmonitor.
>

There's a few issues with replacing it entirely that I've found.

One is the  "CE_MATCH_IGNORE_FSMONITOR" flag. This flag can be set to
let ie_match_stat skip calling refresh_fsmonitor repeatedly. This is
set only in one place right now in preload-index, and it's unclear how
necessary this optimization even is, given that refresh_fsmonitor has
a check whether it's been called already, and returns if true.

The second is that git ls-files has an "f" option that makes it "use
lowercase letters for 'fsmonitor clean' files". I think this can
simply be replaced by checking if a file is up to date instead of
specifically via fsmonitor.

If we do go ahead with the replace, we will have to be diligent about
calling refresh_fsmonitor everywhere, or we will have correctness
issues. I patched git locally to do this, and immediately saw a bug in
git stash where the underlying git reset --hard skipped modifying a
file it should have. In my opinion refresh_fsmonitor should be called
somewhere top level, like an initialization, but I'm not sure if that
makes sense for all git subcommands.

Do you think it's worth cleaning up and sending this patch instead? It
will reduce the surface area of bugs and remove a bunch of functions
like mark_fsmonitor_valid/mark_fsmonitor_invalid



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux