Junio C Hamano <gitster@xxxxxxxxx> writes: > Jens Lehmann <Jens.Lehmann@xxxxxx> writes: > ... >> @@ -220,7 +224,7 @@ static int get_stat_data(struct cache_entry *ce, >> const unsigned char **sha1p, >> unsigned int *modep, >> int cached, int match_missing, >> - unsigned *dirty_submodule, int output_format) >> + unsigned *dirty_submodule, struct diff_options *diffopt) >> { >> const unsigned char *sha1 = ce->sha1; >> unsigned int mode = ce->ce_mode; > > Below the context of this hunk, we seem to do this: > > if (!cached && !ce_uptodate(ce)) { > ... > if gitlink then call is_submodule_modified() > } > > But isn't it inconsistent with hunk at the beginning of this patch (ll 161-170) > that essentially says "entries that is ce_uptodate() is Ok, but if it is a > gitlink we need to look deeper"? Why isn't this function looking deeper > even when we see that the submodule entry is ce_uptodate()? > > Side note: the lack of ce_skip_worktree() check is Ok. The callers of > get_stat_data() are show_new_file() and show_mododified() but they are > never called from their sole caller, do_oneway_diff(), on a skipped > worktree entry. I think we need to clarify the rule for ce_uptodate(ce). The rule has always been that an entry that is ce_uptodate(ce) is _known_ to be clean, and nobody should have to dig deeper to double check. We should keep it that way. We at least need to fix preload_index() not to mark any gitlink entries with ce_mark_uptodate(ce), as your series changes the definition of a dirty submodule. It used to be that if a submodule is checked out and its HEAD matches what is recorded in the index of the superproject, then the submodule is clean. The checks made in preload_thread() is consistent with this definition. With the update, we consider that having local changes in the submodule (either to the index or to the work tree files) makes it modified (which by the way is a right definition, and prevents people from forgetting to commit in the submodule and updating the superproject index with the new submodule commit, before commiting the state in the superproject). Because the checks in preload_thread() does not cover this kind of change that is_submodule_modified() reports, it shouldn't mark a gitlink entry as ce_uptodate(ce). Another possibility is to run the is_submodule_modified() check inside ie_match_stat(), but (1) I don't know if that function is thread-safe, and (2) I don't think we would want to do it in preload-index time (it is rather expensive). The reason preload_index() passes CE_MATCH_RACY_IS_DIRTY to ie_match_stat() is to avoid doing a rather expensive ce_modified_check_fs() --- it avoids the overhead and leaves the expensive check to the true callers that care if the entry is really clean. In the same sense, even if we were to run is_submodule_modified() there, we would want to avoid doing so when we are running ie_match_stat() from preload codepath. We need to also see if there other codepaths that call ce_mark_uptodate(ce) on a gitlink that hasn't been checked with is_submodule_modified(), and eliminate them. Then we can fix the "even though ce_uptodate(ce) says this entry is clean, if it is gitlink, we need to double check" insanity around ll 164 of diff-lib.c. We should be able to trust ce_uptodate(ce) even for gitlinks. What do you think? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html