Re: [PATCH v2] diff-lib: Fix check_removed when fsmonitor is on

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 7, 2023 at 11:07 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Josip Sokcevic <sokcevic@xxxxxxxxxx> writes:
>
> > diff --git a/diff-lib.c b/diff-lib.c
> > index d8aa777a73..664613bb1b 100644
> > --- a/diff-lib.c
> > +++ b/diff-lib.c
> > @@ -39,11 +39,22 @@
> >  static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st)
> >  {
> >       assert(is_fsmonitor_refreshed(istate));
>
> Not a problem this patch introduces, but doesn't this call path
>
>   diff_cache()
>   -> unpack_trees()
>      -> oneway_diff()
>         -> do_oneway_diff()
>            -> show_new_file(), show_modified()
>                -> get_stat_data()
>                   -> check_removed()
>
> violate the assertion?  If so, perhaps we should rewrite it into a
> more explicit "if (...) BUG(...)" that is not compiled away.

True, I will update it.

>
> > -     if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) {
> > -             if (!is_missing_file_error(errno))
> > -                     return -1;
> > -             return 1;
> > +     if (ce->ce_flags & CE_FSMONITOR_VALID) {
> > +             /*
> > +              * Both check_removed() and its callers expect lstat() to have
> > +              * happened and, in particular, the st_mode field to be set.
> > +              * Simulate this with the contents of ce.
> > +              */
> > +             memset(st, 0, sizeof(*st));
>
> It is true that the original, when CE_FSMONITOR_VALID bit is set,
> bypasses lstat() altogether and leaves the contents of st completely
> uninitialized, but this is still way too insufficient, isn't it?
>
> There are three call sites of the check_removed() function.
>
>  * The first one in run_diff_files() only cares about st.st_mode and
>    other members of the structure are not looked at.  This makes
>    readers wonder if the "st" parameter to check_removed() should
>    become "mode_t *st_mode" to clarify this point, but the primary
>    thing I want to say is that this caller will not mind if we leave
>    other members of st bogus (like 0-bit filled) as long as the mode
>    is set correctly.
>
>  * The second one in run_diff_files() passes the resulting &st to
>    match_stat_with_submodule(), which in turn passes it to
>    ie_match_stat(), which cares about "struct stat" members that are
>    used for quick change detection, like owner, group, mtime.
>    Giving it a bogus st will most likely cause it to report a
>    change.
>
>  * The third one is in get_stat_data().  This also uses the &st to
>    call match_stat_with_submodule(), so it is still totally broken
>    to give it a bogus st, the same way as the second caller above.
>
> > +             st->st_mode = ce->ce_mode;
>
> Does this work correctly when the cache entry points at a gitlink,
> which uses 0160000 that is not a valid st_mode?  I think you'd want
> to use a reverse function of create_ce_mode().

I realized this too but after I sent the patch. I don't think we have
a good way to reverse it, so the best we can do is to guess it. But I
don't think we should do that. Instead, should we just zero the struct
and add a TODO? Alternative could be to use a double pointer for stat
and do more checks in call sites, but I'm not familiar with the
codebase to how that branching would need to look like.

Any preference?

-- 
Josip Sokcevic




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux