Re: [PATCH] unpack-trees: add core.sparseCheckoutRmFiles config

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Tue, Jun 1, 2021 at 11:14 PM Tim Renouf (open source)
<tpr.ll@xxxxxxxxxxxx> wrote:
>
> Hi all
>
> Thanks for the reviews and comments.
>
> My use case is that I never want it to remove or otherwise touch those files outside of sparse-checkout that happen to be the same path as index paths. I realize that currently gives me complications (e.g. I must never try and merge/cherry-pick/rebase a commit that might cause a merge conflict out there), and I realize that’s not what everyone else wants. For example, I don’t want git reset --hard or whatever to remove those files. Hence the config option.

Every case I've ever seen of present-despite-skip-worktree files is an
accidental and erroneous condition.  If you're trying to use it for
something else, we'd really need to understand the higher level
use-cases so that we can make ranges of commands work together nicely.
The sparse checkout capability itself was started by a low-level
implementational detail ("treat paths as though they matched HEAD and
don't write them to the working tree") and led to a maze of surprises,
bugs, edge cases, etc.  I'd rather not compound that problem by adding
another configuration option defined via a similar low-level
implementational detail.

I'm also leaning towards having `git reset --hard` not clear
present-despite-skip-worktree files, and not having `git status`
report them; both seem like an unnecessary performance issue for this
rare accidental/erroneous condition.  I totally agree that `git
switch/checkout` should not delete or overwrite such files if they
differ from HEAD; but similar to how having a dirty file prevents a
branch switch to some branch that deletes the file, a
present-despite-skip-worktree file that differs from HEAD should
result in an error message and an aborted switch/checkout.  At least
for the standard sparse-checkout behavior; don't know what your
usecase is.

> Am I right in saying that the sparse-index work makes it easier to achieve my use case? In that those outside-sparse-checkout paths would not ever get merged into the index, even if I merge/cherry-pick/rebase a tree with paths there?

If you merge/cherry-pick/rebase a commit with changes to a path
outside the sparse-checkout, and it merges cleanly with your current
branch, then with sparse-index it wouldn't even show up in the index
(though one of its parent trees would be modified because of the
changes to that file).  But that's not very different than without the
sparse-index, at least with the ort merge backend (available and ready
for using starting with git-2.32).  The recursive merge strategy (the
default) is known to write files to the working tree despite the
sparse-checkout requests, even when the merge is clean, but that's
just a bug in the recursive backend.  (One that isn't easily fixable
within its design, which is one of many reasons it was being written
in the form of the ort backend.)

If you merge/cherry-pick/rebase a commit with changes to a path
outside the sparse-checkout, and it conflicts with your branch, then
with or without sparse-index, that file is going to show up in the
index with higher order stages and be written to the working tree.
That is, so long as you don't have a file in the way.  Since the ort
backend makes use of the checkout machinery to switch from the current
state to the new state, fixing checkout to throw an error and abort
when a present-despite-skip-worktree file is present would cause
merges/rebases/cherry-picks to do the same.

> I can go into more details on how I arrived ay my use case if it helps.
>
> So maybe there are two separate things here:
>
> 1. The bug that checkout removes my file when it is dirty, instead of refusing (unless -f) or just ignoring it.

Yep, we need to fix this.

> 2. My use case, which is to do its best to never remove or otherwise touch worktree files outside sparse-checkout.

Can you expand on this use case a bit?  Should git diff or git status
report on your changes for your usecase?  Should "git restore" ignore
the given paths and not overwrite it?  What if the user explicitly
listed the path in question?  Should stash save these changes or
ignore them?  Should add include these changes or ignore them?  Since
your indexed file is different than the worktree file, should "git mv"
move just the file recorded in the index, just the one in the
worktree, or both?  If someone tries to run "git archive", should your
modified files be included?  If you don't like anything touching these
paths, does that mean they are also uninteresting?  So for example,
should "git log -p" or "git grep $REVISION" only return results inside
the sparse-checkout list of paths?

>From a higher level, what are you trying to achieve?  Is it similar to
`git update-index --assume-unchanged`?  The point of sparse-checkout
was to reduce the number of files in the working tree, but it appears
you aren't trying to do that; you are putting files back into the
working tree anyway.  So, what's the point of sparse-checkout then?

It's possible sparse-checkout doesn't meet your needs, and just isn't
designed to meet them, and we need another special concept for your
case.  Or perhaps there's a config option that's meaningful.  Or maybe
you're just struggling with the bugs of sparse-checkout, of which
there are *many*.  See e.g.
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@xxxxxxxxxxxxxx/

> > I'm also worried that making a config option may have masked subtle
> > bugs in the patch that the rest of the testsuite would have turned up.
>
> It is true that not hiding it in a config option makes a few tests fail, including ones that specifically test that a git reset after a materialization from a merge conflict causes the file to be removed.

Yeah, not so surprising that it has weird interactions with merging
(and perhaps different weird interactions with different merge
backends).  Anything that touches unpack-trees.c either needs to be
part of standard operation, or if it's behind a special config option,
then it needs to be accompanied with a huge number of tests from many
different commands since it affects so many things.  We're trying to
add a battery of tests for sparse-checkout and sparse-index, and we
still obviously need a lot more.

Hope that helps,
Elijah




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux