On Mon, Jan 10, 2022 at 7:47 AM Victoria Dye <vdye@xxxxxxxxxx> wrote: > > Elijah Newren wrote: > > On Tue, Jan 4, 2022 at 9:37 AM Victoria Dye via GitGitGadget > > <gitgitgadget@xxxxxxxxx> wrote: > >> > >> From: Victoria Dye <vdye@xxxxxxxxxx> > >> > >> Introduce tests for a variety of `git update-index` use cases, including > >> performance scenarios. > > > > Makes sense. > > > >> Tests for `update-index add/remove` are specifically > >> focused on how `git stash` uses `git update-index` as a subcommand to > >> prepare for sparse index integration with `stash` in a future series. > > > > This is possibly a tangent, but I'd rather that if we were trying to > > fix `git stash`, that we instead would do so by making it stop forking > > subprocesses and having it call internal API instead. See for > > example, a4031f6dc0 ("Merge branch 'en/stash-apply-sparse-checkout' > > into maint", 2021-02-05) which did this. The fact that it forks so > > many subprocesses is a bug, and comes from the fact that stash is a > > partial conversion from shell to C. I think the subprocess forking is > > part of the problem that causes issues for sparse-checkouts, as I > > spelled out in patches 2 & 3 of the series I mentioned above. > > > > However, that doesn't affect this series. > > > >> Co-authored-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > >> Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> > >> --- > >> t/perf/p2000-sparse-operations.sh | 1 + > >> t/t1092-sparse-checkout-compatibility.sh | 125 +++++++++++++++++++++++ > >> 2 files changed, 126 insertions(+) > >> > >> diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh > >> index 54f8602f3c1..7dbed330160 100755 > >> --- a/t/perf/p2000-sparse-operations.sh > >> +++ b/t/perf/p2000-sparse-operations.sh > >> @@ -118,5 +118,6 @@ test_perf_on_all git diff --cached > >> test_perf_on_all git blame $SPARSE_CONE/a > >> test_perf_on_all git blame $SPARSE_CONE/f3/a > >> test_perf_on_all git checkout-index -f --all > >> +test_perf_on_all git update-index --add --remove > >> > >> test_done > >> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh > >> index 6ecf1f2bf8e..6804ab23a27 100755 > >> --- a/t/t1092-sparse-checkout-compatibility.sh > >> +++ b/t/t1092-sparse-checkout-compatibility.sh > >> @@ -630,6 +630,131 @@ test_expect_success 'reset with wildcard pathspec' ' > >> test_all_match git ls-files -s -- folder1 > >> ' > >> > >> +test_expect_success 'update-index modify outside sparse definition' ' > >> + init_repos && > >> + > >> + write_script edit-contents <<-\EOF && > >> + echo text >>$1 > >> + EOF > >> + > >> + # Create & modify folder1/a > >> + run_on_sparse mkdir -p folder1 && > >> + run_on_sparse cp ../initial-repo/folder1/a folder1/a && > >> + run_on_all ../edit-contents folder1/a && > > > > As I've mentioned to Stolee, I'd rather these were explicitly marked > > as intentionally setting up an erroneous condition. > > > > However, that might not matter if my other series gets accepted (the > > one I promised to send out yesterday, but then spent all day > > responding to emails instead). Hopefully I'll send it soon. > > > >> + > >> + # If file has skip-worktree enabled, update-index does not modify the > >> + # index entry > >> + test_sparse_match git update-index folder1/a && > >> + test_sparse_match git status --porcelain=v2 && > >> + test_must_be_empty sparse-checkout-out && > >> + > >> + # When skip-worktree is disabled (even on files outside sparse cone), file > >> + # is updated in the index > >> + test_sparse_match git update-index --no-skip-worktree folder1/a && > >> + test_all_match git status --porcelain=v2 && > >> + test_all_match git update-index folder1/a && > >> + test_all_match git status --porcelain=v2 > > > > These make sense. > > > >> +' > >> + > >> +test_expect_success 'update-index --add outside sparse definition' ' > >> + init_repos && > >> + > >> + write_script edit-contents <<-\EOF && > >> + echo text >>$1 > >> + EOF > >> + > >> + # Create folder1, add new file > >> + run_on_sparse mkdir -p folder1 && > >> + run_on_all ../edit-contents folder1/b && > >> + > >> + # Similar to `git add`, the untracked out-of-cone file is added to the index > >> + # identically across sparse and non-sparse checkouts > >> + test_all_match git update-index --add folder1/b && > >> + test_all_match git status --porcelain=v2 > > > > The comment is not correct: > > > > It is correct, but for *untracked* out-of-cone files only. Those files don't > have a `skip-worktree` bit because they're not in the index in the first > place. > The comment is intended to highlight the fact that `update-index` > (like `git add`, `git status`, etc.) "decides" whether to operate on a file > in a sparse-checkout based on `skip-worktree`, *not* the sparse patterns. git-update-index may not pay attention to the sparsity patterns for untracked files, but `git add` does. Let me demonstrate. First, I set up a simple repo where the following is true: $ git ls-files -t H in-cone/foo.c S out-of-cone/tracked $ git status --porcelain ?? out-of-cone/initially-untracked Now, let's compare `git add` and `git add --sparse`. First, without --sparse: $ git add out-of-cone/initially-untracked The following paths and/or pathspecs matched paths that exist outside of your sparse-checkout definition, so will not be updated in the index: out-of-cone/initially-untracked hint: If you intend to update such entries, try one of the following: hint: * Use the --sparse option. hint: * Disable or modify the sparsity rules. hint: Disable this message with "git config advice.updateSparsePath false" $ git ls-files -t H in-cone/foo.c S out-of-cone/tracked So, `git add $UNTRACKED` did not add the file. In contrast: $ git add --sparse out-of-cone/initially-untracked $ git ls-files -t H in-cone/foo.c H out-of-cone/initially-untracked S out-of-cone/tracked So, `git add --sparse $UNTRACKED` did add it. > Seeing as the comparison to `git add` makes things more confusing, I'll > rephrase the test comment. > > > $ git add out-of-cone/file > > The following paths and/or pathspecs matched paths that exist > > outside of your sparse-checkout definition, so will not be > > updated in the index: > > out-of-cone/file > > hint: If you intend to update such entries, try one of the following: > > hint: * Use the --sparse option. > > hint: * Disable or modify the sparsity rules. > > hint: Disable this message with "git config advice.updateSparsePath false" > > > > I might buy that `git update-index` is lower level and should be > > considered the same as `git add --sparse`, but the comment should > > mention that and try to sell why update-index should be equivalent to > > that instead of to `git add`. > > > > Tracked, out-of-cone files aren't affected by `--add` (the flag allows > `update-index` to add untracked files), and `update-index out-of-cone/tracked` > would ignore the file. Yes, I believe you're explaining update-index behavior correctly. > so I believe the behavior of `update-index` is > currently more consistent with `git add` than `git add --sparse`. But not quite `git add`'s. Just to be clear, let's add update-index to the above comparison I did between add and add --sparse. First, let's go back to the initial setup point: $ git ls-files -t H in-cone/foo.c S out-of-cone/tracked $ git status --porcelain ?? out-of-cone/initially-untracked Now, let's try update-index: $ git update-index --add out-of-cone/initially-untracked $ git ls-files -t H in-cone/foo.c H out-of-cone/initially-untracked S out-of-cone/tracked So, in other words, `git update-index --add $UNTRACKED` matches the behavior of `git add --sparse $UNTRACKED`, not the behavior of `git add $UNTRACKED`. > >> +' > >> + > >> +test_expect_success 'update-index --remove outside sparse definition' ' > >> + init_repos && > > > >> + > >> + # When `--ignore-skip-worktree-entries` is specified, out-of-cone files are > >> + # not removed from the index if they do not exist on disk > >> + test_sparse_match git update-index --remove --ignore-skip-worktree-entries folder1/a && > >> + test_all_match git status --porcelain=v2 && > > > > The file is present despite being marked to be missing, we're ignoring > > the intention of the marking, and we ask for it to be removed, so we > > don't remove it? > > > > The levels of negation are _very_ confusing. It took me a while to > > unravel this. I think the logic is something like this > > * folder1/a is marked as SKIP_WORKTREE, meaning it's not supposed to > > be in the worktree. > > * and it's not. > > * We are stating that we're ignoring SKIP_WORKTREE, though, so this > > looks like a regular file that has been deleted. > > So, what would `git update-index --remove $FILE` do for a normal $FILE > > deleted from the working copy? According to the docs: > > > > --remove > > If a specified file is in the index but is missing then it’s > > removed. Default behavior is to ignore removed file. > > > > So, the docs say it would remove it. But you don't. > > > > > > After digging around and looking at the testcase below, if I had to > > guess what happened, I would say that you figured out what the > > SKIP_WORKTREE behavior was, and assumed that was correct, and added a > > flag that allowed you to request the opposite behavior. > > Unfortunately, I think the pre-existing behavior is buggy. > > > > I understand why you find it buggy, but I am not making baseless assumptions > about the correctness (or lack thereof) of the current behavior. To be clear, the fact that the behavior was there for a decade would typically be basis enough for an assumption (in my opinion), and I wouldn't have faulted folks for making it. I might well have done so myself. My reasoning was just that I was getting confused by the negations and trying to understand the testcase, and when I started to unravel it, I found what looked like a possible inconsistency. Anyway, it's clear here you've actually dug a lot deeper and know the history here. In contrast, I was making assumptions about the history (and ones that weren't correct, though I'd argue my assumptions weren't baseless)... > This > specific "gap" in `update-index --remove` has been discussed in the past [1] > and was acknowledged as non-ideal and a candidate for change in the future. > At the time, the introduction of `--ignore-skip-worktree-entries` [2] was a > "safe" way to ignore `skip-worktree` without changing the default behavior. > > Personally, I think updating the default behavior of `--remove` (and > corresponding deprecation of `--[no-]ignore-skip-worktree-entries`) is > probably the right way to go. However, I'd like to avoid including it in > this series because it deviates pretty substantially from the goal > "integrate with sparse index", and as a result has the potential to > overshadow/derail the rest of the series. If you're alright with (slightly) > deferring changes to the behavior of `--remove`, I can submit a separate > series for it once this one has stabilized. > > [1] https://lore.kernel.org/git/xmqq36fda3i8.fsf@xxxxxxxxxxxxxxxxxxxxxxxxx/ > [2] https://lore.kernel.org/git/163b42dfa21c306dc1dc573c5edfc8bda5c99fd0.1572432578.git.gitgitgadget@xxxxxxxxx/ Oh, wow. Yeah, fixing it in a later series seems fine. However, could you add a comment to these testcases that --remove has behavior that violates the definition of `SKIP_WORKTREE`, and that we might want to fix that in the future but for now you are just testing the pre-existing behavior for coverage? (There is a possibility we can't fix it for some reason when we dig in. In that case, we should update the documentation for `--remove` to call out this special case. But again, that can be for a later series.) > > Of course, there's lots of negation here. Did I get something > > backwards by chance? > > > >> + > >> + # When the flag is _not_ specified ... > > > > In my head I'm translating this as: > > > > SKIP_WORKTREE = !worktree > > --ignore-skip-worktree-entries = !!worktree > > "flag is _not_ specified" = !!!worktree ? > > > > I'm not sure I would do anything to change it, but just pointing out > > it can be a little hard for others to come up to speed. > > > > Most of the confusion likely comes from the non-standard behavior of > `--remove`, but I think I can distill it into (somewhat) straightforward > statements about `update-index`: > > 1. When using the command *without* either `--ignore-skip-worktree-entries` > OR `--remove`, `skip-worktree` entries provided to the command are > ignored. > 2. When using the command *with* `--remove` and *without* > `--ignore-skip-worktree-entries`, `skip-worktree` entries are *not* > ignored, and are removed from the index. > 3. When both `--remove` and `--ignore-skip-worktree-entries` are specified, > `skip-worktree` entries are again ignored. > 4. `--ignore-skip-worktree-entries` has no effect without `--remove` also > specified > > The goal of this test, then, is to exercise conditions 2 & 3 and directly > show their effect on `skip-worktree` entries. Yeah, it makes sense to have good test coverage. +1. > > >> ... , out-of-cone, not-on-disk files are > >> + # removed from the index > >> + rm full-checkout/folder1/a && > >> + test_all_match git update-index --remove folder1/a && > >> + test_all_match git status --porcelain=v2 && > > > > Documentation/git-update-index.txt defines SKIP_WORKTREE as follows: > > > > "Skip-worktree bit can be defined in one (long) sentence: When reading > > an entry, if it is marked as skip-worktree, then Git pretends its > > working directory version is up to date and read the index version > > instead." > > > > If Git is pretending the file is up-to-date, and `git update-index > > --remove $UP_TO_DATE_FILE` is typically a no-op because the --remove > > flag doesn't do anything when a file is present in the working copy, > > then why is this the expected behavior? > > > > I know it's the traditional behavior of update-index, but > > SKIP_WORKTREE support in Git has traditionally been filled with holes. > > So, was this behavior by design (despite contradicting the > > documentation), or by accident? > > > > As far as I can tell, it appears to have been intentional in the original > `skip-worktree` implementation [3], but given Junio & Johannes' discussion > on the `--ignore-skip-worktree-entries` patch [1], the sentiment now would > probably lean towards having `--remove` ignore `skip-worktree`. > > [1] https://lore.kernel.org/git/xmqq36fda3i8.fsf@xxxxxxxxxxxxxxxxxxxxxxxxx/ > (copied from earlier in this message) > [3] https://lore.kernel.org/git/1250776033-12395-5-git-send-email-pclouds@xxxxxxxxx/ Allow me to comment on this history... I can see sometimes wanting to make a command line option a special case that doesn't follow the general documented rules. Not sure I believe it in this specific case, but I can see it as being a possibility in general. But when you're going to make an option not follow the otherwise documented behavior, then that option's documentation should have a special callout about how it diverges and/or why it's a special case. So, this seems like a double-layered problem to me (not only choosing the wrong behavior, but leaving the documentation to claim it does something else). It looks like Dscho partially tried to fix it, but I would have preferred that Dscho's documentation comment he added to `--ignore-skip-worktree-entries` be added to `--remove`; it's odd that folks wanting to learn about `--remove` behavior need to read the documentation for `--ignore-skip-worktree-entries` (even if they aren't using it) in order to understand `--remove`'s behavior. > > (To be fair, I think the definition given in the manual for > > SKIP_WORKTREE is horrible for other reasons, so I don't like leaning > > on it. But I used different logic above in the > > --ignore-skip-worktree-entries case to arrive at the same conclusion > > that the --remove behavior of update-index seems to be backwards. > > Unless I missed a negation in both cases somewhere? There are so many > > floating around...) > > > >> + # NOTE: --force-remove supercedes --ignore-skip-worktree-entries, removing > >> + # a skip-worktree file from the index (and disk) when both are specified > >> + test_all_match git update-index --force-remove --ignore-skip-worktree-entries folder1/a && > >> + test_all_match git status --porcelain=v2 > > > > Makes sense. > > > >> +' > >> + > >> +test_expect_success 'update-index with directories' ' > >> + init_repos && > >> + > >> + # update-index will exit silently when provided with a directory name > >> + # containing a trailing slash > >> + test_all_match git update-index deep/ folder1/ && > >> + grep "Ignoring path deep/" sparse-checkout-err && > >> + grep "Ignoring path folder1/" sparse-checkout-err && > > > > Is this desired behavior or just current behavior? > > > >> + > >> + # When update-index is given a directory name WITHOUT a trailing slash, it will > >> + # behave in different ways depending on the status of the directory on disk: > >> + # * if it exists, the command exits with an error ("add individual files instead") > >> + # * if it does NOT exist (e.g., in a sparse-checkout), it is assumed to be a > >> + # file and either triggers an error ("does not exist and --remove not passed") > >> + # or is ignored completely (when using --remove) > >> + test_all_match test_must_fail git update-index deep && > >> + run_on_all test_must_fail git update-indexe folder1 && > > > > This one will fail for the wrong reason, though -- `update-indexe` is > > not a valid subcommand. (extra 'e' at the end) > > > > Thanks for catching that! I'll update in my next re-roll. > > >> + test_must_fail git -C full-checkout update-index --remove folder1 && > >> + test_sparse_match git update-index --remove folder1 && > >> + test_all_match git status --porcelain=v2 > > > > Otherwise these seem reasonable. > > > >> +' > >> + > >> +test_expect_success 'update-index --again file outside sparse definition' ' > >> + init_repos && > >> + > >> + write_script edit-contents <<-\EOF && > >> + echo text >>$1 > >> + EOF > > > > Copy and paste and forget to remove? edit-contents doesn't seem to be > > used in this test. > > > > Will remove. > > >> + > >> + test_all_match git checkout -b test-reupdate && > >> + > >> + # Update HEAD without modifying the index to introduce a difference in > >> + # folder1/a > >> + test_sparse_match git reset --soft update-folder1 && > >> + > >> + # Because folder1/a differs in the index vs HEAD, > >> + # `git update-index --remove --again` will effectively perform > >> + # `git update-index --remove folder1/a` and remove the folder1/a > >> + test_sparse_match git update-index --remove --again && > >> + test_sparse_match git status --porcelain=v2 > > > > This might need a --ignore-skip-worktree-entries, as per the > > discussion above. Otherwise, this test makes sense. > > > > The `--ignore-skip-worktree-entries` option is explicitly omitted because > this test needs `update-index` to modify a `skip-worktree` entry. However, > given the debate around what `--remove` should do, I'll update the scenario > to not use `--remove` or any variation of it. > > >> +' > >> + > >> +test_expect_success 'update-index --cacheinfo' ' > >> + init_repos && > >> + > >> + deep_a_oid=$(git -C full-checkout rev-parse update-deep:deep/a) && > >> + folder2_oid=$(git -C full-checkout rev-parse update-folder2:folder2) && > >> + folder1_a_oid=$(git -C full-checkout rev-parse update-folder1:folder1/a) && > >> + > >> + test_all_match git update-index --cacheinfo 100644 $deep_a_oid deep/a && > >> + test_all_match git status --porcelain=v2 && > >> + > >> + # Cannot add sparse directory, even in sparse index case > >> + test_all_match test_must_fail git update-index --add --cacheinfo 040000 $folder2_oid folder2/ && > >> + > >> + # Sparse match only - because folder1/a is outside the sparse checkout > >> + # definition (and thus not on-disk), it will appear "deleted" in > >> + # unstaged changes. > >> + test_all_match git update-index --add --cacheinfo 100644 $folder1_a_oid folder1/a && > >> + test_sparse_match git status --porcelain=v2 > > > > Makes sense, because the update-index command removes the existing > > cache entry and adds a new one without the SKIP_WORKTREE bit. But it > > might be worth mentioning that in the commit message. Also, you could > > follow this up with `git update-index --skip-worktree folder1/a`, and > > then do a test_all_match git status --porcelain=v2, to show that when > > the SKIP_WORKTREE bit is restored back to the file, then it again does > > as expected despite not being on-disk. > > > > I really like this - it helps clarify how `update-index` can be used to > correctly add a sparse-checkout entry to the index with plumbing commands. > I'll include it in V2. > > >> +' > >> + > >> test_expect_success 'merge, cherry-pick, and rebase' ' > >> init_repos && > >> > >> -- > >> gitgitgadget