On Tue, Jul 20, 2021 at 1:14 PM Derrick Stolee via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > > From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > > When running unpack_trees() with a sparse index, we attempt to operate > on the index without expanding the sparse directory entries. Thus, we > operate by manipulating entire directories and passing them to the > unpack function. In the case of the 'git checkout' command, this is the > twoway_merge() function. > > There are several cases in twoway_merge() that handle different > situations. One new one to add is the case of a directory/file conflict > where the directory is sparse. Before the sparse index, such a conflict > would appear as a list of file additions and deletions. Now, > twoway_merge() initializes 'current', 'oldtree', and 'newtree' from > src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is > equal to the df_conflict_entry. The way to determine that we have a > directory/file conflict is to test that 'current' and 'newtree' disagree > on being sparse directory entries. > > When we are in this case, we want to resolve the situation by calling > merged_entry(). This allows replacing the 'current' entry with the > 'newtree' entry. This is important for cases where we want to run 'git > checkout' across the conflict and have the new HEAD represent the new > file type at that path. The first NEEDSWORK comment dropped in t1092 > demonstrates this necessary behavior. > > However, we still are in a confusing state when 'current' corresponds to > a staged change within a sparse directory that is not present at HEAD. > This should be atypical, because it requires adding a change outside of > the sparse-checkout cone, but it is possible. Since we are unable to > determine that this is a staged change within twoway_merge(), we cannot > add a case to reject the merge at this point. I believe this is due to > the use of df_conflict_entry in the place of 'oldtree' instead of using > the valud at HEAD, which would provide some perspective to this > decision. Any change that would allow this differentiation for staged > entries would need to involve information further up in unpack_trees(). > > That work should be done, sometime, because we are further confusing the > behavior of a directory/file conflict when staging a change in the > directory. The two cases 'checkout behaves oddly with df-conflict-?' in > t1092 demonstrate that even without a sparse-checkout, Git is not > consistent in its behavior. Neither of the two options seems correct, > either. This change makes the sparse-index behave differently than the > typcial sparse-checkout case, but it does match the full checkout > behavior in the df-conflict-2 case. > > Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > --- > t/t1092-sparse-checkout-compatibility.sh | 24 ++++++++++++------------ > unpack-trees.c | 11 +++++++++++ > 2 files changed, 23 insertions(+), 12 deletions(-) > > diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh > index 79b4a8ce199..91e30d6ec22 100755 > --- a/t/t1092-sparse-checkout-compatibility.sh > +++ b/t/t1092-sparse-checkout-compatibility.sh > @@ -396,14 +396,14 @@ test_expect_success 'diff with renames and conflicts' ' > done > ' > > -# NEEDSWORK: the sparse-index fails to move HEAD across a directory/file > -# conflict such as when checking out df-conflict-1 and df-conflict2. > test_expect_success 'diff with directory/file conflicts' ' > init_repos && > > for branch in rename-out-to-out \ > rename-out-to-in \ > rename-in-to-out \ > + df-conflict-1 \ > + df-conflict-2 \ > fd-conflict > do > git -C full-checkout reset --hard && > @@ -667,10 +667,7 @@ test_expect_success 'checkout behaves oddly with df-conflict-1' ' > git -C sparse-checkout checkout df-conflict-1 \ > 1>sparse-checkout-out \ > 2>sparse-checkout-err && > - > - # NEEDSWORK: the sparse-index case refuses to change HEAD here, > - # but for the wrong reason. > - test_must_fail git -C sparse-index checkout df-conflict-1 \ > + git -C sparse-index checkout df-conflict-1 \ > 1>sparse-index-out \ > 2>sparse-index-err && > > @@ -684,7 +681,11 @@ test_expect_success 'checkout behaves oddly with df-conflict-1' ' > test_cmp expect full-checkout-out && > test_cmp expect sparse-checkout-out && > > + # The sparse-index reports no output > + test_must_be_empty sparse-index-out && > + > # stderr: Switched to branch df-conflict-1 > + test_cmp full-checkout-err sparse-checkout-err && > test_cmp full-checkout-err sparse-checkout-err > ' > > @@ -719,17 +720,15 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' ' > git -C sparse-checkout checkout df-conflict-2 \ > 1>sparse-checkout-out \ > 2>sparse-checkout-err && > - > - # NEEDSWORK: the sparse-index case refuses to change HEAD > - # here, but for the wrong reason. > - test_must_fail git -C sparse-index checkout df-conflict-2 \ > + git -C sparse-index checkout df-conflict-2 \ > 1>sparse-index-out \ > 2>sparse-index-err && > > # The full checkout deviates from the df-conflict-1 case here! > # It drops the change to folder1/larger-content and leaves the > - # folder1 path as-is on disk. > + # folder1 path as-is on disk. The sparse-index behaves the same. > test_must_be_empty full-checkout-out && > + test_must_be_empty sparse-index-out && > > # In the sparse-checkout case, the checkout deletes the folder1 > # file and adds the folder1/larger-content file, leaving all other > @@ -741,7 +740,8 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' ' > test_cmp expect sparse-checkout-out && > > # Switched to branch df-conflict-1 > - test_cmp full-checkout-err sparse-checkout-err > + test_cmp full-checkout-err sparse-checkout-err && > + test_cmp full-checkout-err sparse-index-err > ' > > test_done > diff --git a/unpack-trees.c b/unpack-trees.c > index 0a5135ab397..309c1352f5d 100644 > --- a/unpack-trees.c > +++ b/unpack-trees.c > @@ -2619,6 +2619,17 @@ int twoway_merge(const struct cache_entry * const *src, > same(current, oldtree) && !same(current, newtree)) { > /* 20 or 21 */ > return merged_entry(newtree, current, o); > + } else if (current && !oldtree && newtree && > + S_ISSPARSEDIR(current->ce_mode) != S_ISSPARSEDIR(newtree->ce_mode) && > + ce_stage(current) == 0) { > + /* > + * This case is a directory/file conflict across the sparse-index > + * boundary. When we are changing from one path to another via > + * 'git checkout', then we want to replace one entry with another > + * via merged_entry(). If there are staged changes, then we should > + * reject the merge instead. > + */ > + return merged_entry(newtree, current, o); > } else > return reject_merge(current, o); > } > -- I'm still a bit unhappy with the unpack-trees.c change (I wonder if having "path/" vs "path" is going to make D/F conflicts hard to handle and whether we need to make unpack_trees do something special to make both paths be considered at the same time with one call to twoway_merge() instead of two in order to fix this), BUT I think you've done a really good job of documenting it and pointing out that unpack_trees() messes up even without sparse checkouts on D/F conflicts (though in a different way). I think you've documented it well enough, and argued about the likelihood of issues well enough, that it makes sense to proceed and circle back and fix this up later.