Re: [PATCH 1/7] reset: behave correctly with sparse-checkout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Kevin Willford via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> @@ -127,12 +129,49 @@ static void update_index_from_diff(struct diff_queue_struct *q,
>  		struct diff_options *opt, void *data)
>  {
>  	int i;
> +	int pos;
>  	int intent_to_add = *(int *)data;
>  
>  	for (i = 0; i < q->nr; i++) {
>  		struct diff_filespec *one = q->queue[i]->one;
> +		struct diff_filespec *two = q->queue[i]->two;
>  		int is_missing = !(one->mode && !is_null_oid(&one->oid));
> +		int was_missing = !two->mode && is_null_oid(&two->oid);

Not a problem introduced by this patch per-se, but is_missing is a
counter-intuitive name for what the boolean wants to represent, I
think, which was brought in by b4b313f9 (reset: support "--mixed
--intent-to-add" mode, 2014-02-04).  Before the commit, we used to
say

 	for (i = 0; i < q->nr; i++) {
 		struct diff_filespec *one = q->queue[i]->one;
		if (one->mode && !is_null_sha1(one->sha1)) {
			... create ce out of one and add to the	index ...
		} else
 			remove_file_from_cache(one->path);
		...

i.e. "if one is not missing, create a ce and add it, otherwise
remove the path".

It should have been called "one_is_missing" if we wanted to
literally express the condition the code checked, but an even better
name would have been given after the intent of what the code wants
to do with the information.  If the resetted-to tree (that is what
'one' side of the comparison in diff_cache() is) has a valid blob,
we want it to be in the index, and otherwise, we do not want it in
the index.

Now, the patch makes things worse and I had to do the above digging
to see why the new code is even more confusing.  The 'two' side of
the comparison is what is in the to-be-corrected-by-reset index.
"was_missing" in contrast to "is_missing" makes it sound as if it
was the state before whatever "is_missing" tries to represent, but
that is not what is happening.  "is_missing" does not mean "the
entry is currently not there in the index", but "was_missing" does
mean exactly that: "the entry is currently not there in the index".

There isn't any "was" missing about it.  It "is" missing in the
index.  Instead of renaming, I wonder if we can do without this new
variable.  Let's read on the patch.

Also, now the code uses both sides of the filepair, we must double
check that our do_diff_cache() is *not* doing any rename detection.
It might be even prudent to ensure that 

	if (strcmp(one->path, two->path))
		BUG("reset drove diff-cache with rename detection");

but it might be with too much paranoia.  I dunno.

>  		struct cache_entry *ce;
> +		struct cache_entry *ce_before;
> +		struct checkout state = CHECKOUT_INIT;

These two new variables do not need this wide a scope, I would
think.  Shouldn't it be inside the body of the new "if" statement
this patch adds?

> +		/*
> +		 * When using the sparse-checkout feature the cache entries
> +		 * that are added here will not have the skip-worktree bit
> +		 * set. Without this code there is data that is lost because
> +		 * the files that would normally be in the working directory
> +		 * are not there and show as deleted for the next status.
> +		 * In the case of added files, they just disappear.
> +		 *
> +		 * We need to create the previous version of the files in
> +		 * the working directory so that they will have the right
> +		 * content and the next status call will show modified or
> +		 * untracked files correctly.
> +		 */
> +		if (core_apply_sparse_checkout && !file_exists(two->path)) {

In a sparsely checked out working tree, there is nothing in the
working tree at the path.  It may be because it is sparse and we
didn't want to have anything there, or it may be because the user
wanted to get rid of it and said "rm path" (not "git rm path") and
this part of the tree were of interest even if the sparse checkout
feature was used to hide other parts of the tree.  With the above
two checks alone, we cannot tell which.  Let's read on.

> +			pos = cache_name_pos(two->path, strlen(two->path));

We check the index to see if there is an entry for it.  I suspect
that because we need to do this check anyway, we shouldn't even have
to look at 'two' (and add a new 'was_missing' variable), because
'one' and 'two' came from a comparison between the resetted-to tree
object and the current index, and if cache_name_pos() for the path
(we can use 'one->path') says there is an entry in the index, by
definition, 'two' would not be showing a "removed" state (i.e. "the
resetted-to tree had it, the index does not" is what "was_missing"
wants to say).

So I wonder if it is better to

 - use one->path for !file_exists() above and cache_name_pos() here
   instead of two->path.

 - drop the confusingly named 'was_missing', because (pos < 0) is
   equivalent to it after this point, and we didn't need it up to
   this point.

> +			if ((pos >= 0 && ce_skip_worktree(active_cache[pos])) &&

And we do find an entry for it.  So this path is not something
sparse cone specifies not to check out (otherwise we would have a
tree-like entry that covers this path in the index and not an entry
for this specific path)?

Anyway, if it is marked with the skip-worktree bit, does that mean
there is no risk that the reason why two->path does not exist in the
working tree is because we earlier gave it in the working tree but
it was later removed by the user?  Just making sure that we are not
breaking the end-user's wish that the path should be removed by
resurrecting it in the working tree with a new call to
checkout_entry().

> +			    (is_missing || !was_missing)) {

And in such a case, if the resetted-to tree says we shouldn't have
the path in the resulting index, or if the original state in the
index had this path (but because (0 <= pos) must be true for us to 
reach this point, I am not sure if "was_missing" can ever be true
here), then do the following, which is ...

> +				state.force = 1;
> +				state.refresh_cache = 1;
> +				state.istate = &the_index;
> +
> +				ce_before = make_cache_entry(&the_index, two->mode,
> +							     &two->oid, two->path,
> +							     0, 0);
> +				if (!ce_before)
> +					die(_("make_cache_entry failed for path '%s'"),
> +						two->path);
> +
> +				checkout_entry(ce_before, &state, NULL, NULL);

... to resurrect the last "git add"ed state from the index and write
it out to the working tree.  As I suspected, ce_before and state
should be scoped inside this block and not visible outside, no?

I am not sure why this behaviour is desirable.  The "mixed" reset
should not have to touch the working tree in the first place.

The large comment before this block says "... will not have the
skip-worktree bit set", but we are dealing with a case where the
original index had a cache entry there with skip-worktree bit set,
so isn't the more desirable outcome that the cache entry added back
to the index has the skip-worktree bit still set and there is no
working tree file that the user did not desire to have?

And isn't it the matter of preserving the skip-worktree bit when the
code in the post context of this hunk this patch did not touch adds
the entry back to the index?

> +			}
> +		}
>  
>  		if (is_missing && !intent_to_add) {
>  			remove_file_from_cache(one->path);

If we look at the code after this point, we do use "is_missing"
information to tweak ce->ce_flags with the intent-to-add bit.

Perhaps we can do a similar tweak to the cache entry to mark it with
skip-worktree bit if the index had a cache entry at the path with
the bit set?  The code that needs to do so would only have to
remember if the one->path is in the current index and the cache
entry for the path has the skip-worktree bit in the body of the new
if() statement about (core_apply_sparse_checkout && !file_exists())
added by this patch (I am not sure if !file_exists() even matters,
though, as the approach I am suggesting is to preserve the skip bit
and not disturb the working tree files at all).

Thanks.







[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux