Re: What's cooking in git.git (Jan 2010, #07; Fri, 22)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <gitster@xxxxxxxxx> writes:

> Jens Lehmann <Jens.Lehmann@xxxxxx> writes:
> ...
>> @@ -220,7 +224,7 @@ static int get_stat_data(struct cache_entry *ce,
>>  			 const unsigned char **sha1p,
>>  			 unsigned int *modep,
>>  			 int cached, int match_missing,
>> -			 unsigned *dirty_submodule, int output_format)
>> +			 unsigned *dirty_submodule, struct diff_options *diffopt)
>>  {
>>  	const unsigned char *sha1 = ce->sha1;
>>  	unsigned int mode = ce->ce_mode;
>
> Below the context of this hunk, we seem to do this:
>
> 	if (!cached && !ce_uptodate(ce)) {
>         	...
>                 if gitlink then call is_submodule_modified()
> 	}
>
> But isn't it inconsistent with hunk at the beginning of this patch (ll 161-170)
> that essentially says "entries that is ce_uptodate() is Ok, but if it is a
> gitlink we need to look deeper"?  Why isn't this function looking deeper
> even when we see that the submodule entry is ce_uptodate()?
>
>     Side note: the lack of ce_skip_worktree() check is Ok.  The callers of
>     get_stat_data() are show_new_file() and show_mododified() but they are
>     never called from their sole caller, do_oneway_diff(), on a skipped
>     worktree entry.

I think we need to clarify the rule for ce_uptodate(ce).

The rule has always been that an entry that is ce_uptodate(ce) is _known_
to be clean, and nobody should have to dig deeper to double check.  We
should keep it that way.

We at least need to fix preload_index() not to mark any gitlink entries
with ce_mark_uptodate(ce), as your series changes the definition of a
dirty submodule.  It used to be that if a submodule is checked out and its
HEAD matches what is recorded in the index of the superproject, then the
submodule is clean.  The checks made in preload_thread() is consistent
with this definition.

With the update, we consider that having local changes in the submodule
(either to the index or to the work tree files) makes it modified (which
by the way is a right definition, and prevents people from forgetting to
commit in the submodule and updating the superproject index with the new
submodule commit, before commiting the state in the superproject).

Because the checks in preload_thread() does not cover this kind of change
that is_submodule_modified() reports, it shouldn't mark a gitlink entry as
ce_uptodate(ce).

Another possibility is to run the is_submodule_modified() check inside
ie_match_stat(), but (1) I don't know if that function is thread-safe, and
(2) I don't think we would want to do it in preload-index time (it is
rather expensive).

The reason preload_index() passes CE_MATCH_RACY_IS_DIRTY to ie_match_stat() 
is to avoid doing a rather expensive ce_modified_check_fs() --- it avoids
the overhead and leaves the expensive check to the true callers that care
if the entry is really clean.  In the same sense, even if we were to run
is_submodule_modified() there, we would want to avoid doing so when we are
running ie_match_stat() from preload codepath.

We need to also see if there other codepaths that call
ce_mark_uptodate(ce) on a gitlink that hasn't been checked with
is_submodule_modified(), and eliminate them.

Then we can fix the "even though ce_uptodate(ce) says this entry is clean,
if it is gitlink, we need to double check" insanity around ll 164 of
diff-lib.c.  We should be able to trust ce_uptodate(ce) even for gitlinks.

What do you think?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]