Josip Sokcevic <sokcevic@xxxxxxxxxx> writes: > diff --git a/diff-lib.c b/diff-lib.c > index d8aa777a73..664613bb1b 100644 > --- a/diff-lib.c > +++ b/diff-lib.c > @@ -39,11 +39,22 @@ > static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st) > { > assert(is_fsmonitor_refreshed(istate)); Not a problem this patch introduces, but doesn't this call path diff_cache() -> unpack_trees() -> oneway_diff() -> do_oneway_diff() -> show_new_file(), show_modified() -> get_stat_data() -> check_removed() violate the assertion? If so, perhaps we should rewrite it into a more explicit "if (...) BUG(...)" that is not compiled away. > - if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) { > - if (!is_missing_file_error(errno)) > - return -1; > - return 1; > + if (ce->ce_flags & CE_FSMONITOR_VALID) { > + /* > + * Both check_removed() and its callers expect lstat() to have > + * happened and, in particular, the st_mode field to be set. > + * Simulate this with the contents of ce. > + */ > + memset(st, 0, sizeof(*st)); It is true that the original, when CE_FSMONITOR_VALID bit is set, bypasses lstat() altogether and leaves the contents of st completely uninitialized, but this is still way too insufficient, isn't it? There are three call sites of the check_removed() function. * The first one in run_diff_files() only cares about st.st_mode and other members of the structure are not looked at. This makes readers wonder if the "st" parameter to check_removed() should become "mode_t *st_mode" to clarify this point, but the primary thing I want to say is that this caller will not mind if we leave other members of st bogus (like 0-bit filled) as long as the mode is set correctly. * The second one in run_diff_files() passes the resulting &st to match_stat_with_submodule(), which in turn passes it to ie_match_stat(), which cares about "struct stat" members that are used for quick change detection, like owner, group, mtime. Giving it a bogus st will most likely cause it to report a change. * The third one is in get_stat_data(). This also uses the &st to call match_stat_with_submodule(), so it is still totally broken to give it a bogus st, the same way as the second caller above. > + st->st_mode = ce->ce_mode; Does this work correctly when the cache entry points at a gitlink, which uses 0160000 that is not a valid st_mode? I think you'd want to use a reverse function of create_ce_mode(). > + } else { > + if (lstat(ce->name, st) < 0) { > + if (!is_missing_file_error(errno)) > + return -1; > + return 1; > + } > } At this point, if FSMONITOR_VALID bit is not set, we will always perform lstat() and get all the members of st populated properly, which is a definite improvement. While I think this does not make it worse (it is an existing bug that the code is broken for a ce with the CE_FSMONITOR_VALID bit set), we may want to leave a note that we _know_ the code after this patch is still broken. "Simulate this with ..." -> "Just setting st_mode is still insufficient and will break majority of callers". It may make sense, until we clean it up, to disable the check for the FSMONITOR_VALID bit in this codepath and always perform lstat(). Optimization matters, but computing quickly in order to return an incorrect result is optimizing for a wrong thing. I dunno. Thanks.