Re: [RFC PATCH v2 4/4] config: add setting to ignore sparsity patterns in some cmds

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, May 9, 2020 at 9:23 PM Matheus Tavares Bernardino
<matheus.bernardino@xxxxxx> wrote:
>
> On Sat, May 9, 2020 at 9:42 PM Matheus Tavares
> <matheus.bernardino@xxxxxx> wrote:
> >
> > diff --git a/t/t7817-grep-sparse-checkout.sh b/t/t7817-grep-sparse-checkout.sh
> > index 3bd67082eb..8509694bf1 100755
> > --- a/t/t7817-grep-sparse-checkout.sh
> > +++ b/t/t7817-grep-sparse-checkout.sh
> > @@ -63,12 +63,28 @@ test_expect_success 'setup' '
> >         test_path_is_file sub/B/b
> >  '
> >
> > +# The two tests bellow check a special case: the sparsity patterns exclude '/b'
> > +# and sparse checkout is enable, but the path exists on the working tree (e.g.
> > +# manually created after `git sparse-checkout init`). In this case, grep should
> > +# honor --restrict-to-sparse-paths.
>
> I just want to highlight a small thing that I forgot to comment on:
> Elijah and I had already discussed about --restrict-to-sparse-paths
> being relevant in grep only with --cached or when a commit-ish is
> given. But it had not occurred to me, before, the possibility of the
> special case mentioned above. I.e. when searching in the working tree
> and a path that should be excluded by the sparsity patterns is
> present. In this patch, I let --restrict-to-sparse-paths control the
> desired behavior for grep in this case too. But please, let me know if
> that doesn't seem like a good idea.

Wow, that is an interesting edge case.  But it can come up during a
merge or rebase or checkout -m, could be manually changed by various
plumbing commands, and might just not be enforced well in various
areas of the system (see e.g. [1]).  Perhaps the most interesting
case, given recent discussion, is submodules -- those might be left in
the working tree despite not matching sparsity paths.  So, should `git
-c sparse.restrictCmds=true grep PATTERN` look at these paths or not?
Currently, you've chosen contradictory answers -- yes to submodules,
and no to other entries.  I'm not certain here, but I've given it a
little thought and think there's a few things to take into
consideration:

Users are used to the fact that
    grep -r PATTERN *
searches existing files for PATTERN.  If you delete a file, then a
subsequent grep isn't going to search through it.  Similarly, git grep
is billed as a grep which limits searches to tracked files, thus they
expect
    git grep PATTERN
to search for files in their working copy but limiting it to files
which are tracked.  From this angle, I think users would be surprised
if `git grep` searched through deleted files, and they would also be
surprised if it ignored tracked and present files.

That is a basic answer, but let's go a bit further.  Since git grep
also has history at its disposal, it has more options.  For example:
    git grep REVISION PATTERN
means to search through all tracked files (those are the only kinds
that are recorded in revisions anyway) as of REVISION for the given
PATTERN, without checking it out.  Users probably expect this to
behave the same as:
    git checkout REVISION
    git grep PATTERN
and since checkout pays attention to sparsity rules, this is why we'd
want to have both "git grep PATTERN" and "git grep REVISION PATTERN"
pay attention to sparsity rules.

When we think in terms of "git grep REVISION PATTERN" as an optimized
version of "git checkout REVISION && git grep PATTERN" it puts us in
the frame of mind of asking the following question:
   For each path, would it be marked as SKIP_WORKTREE if we were to
check it out right now?  If so, we should skip it for the grepping.
Usually, the SKIP_WORKTREE bit is set for files if and only if they
don't match the sparsity patterns.  Also, we can't use the
SKIP_WORKTREE bit of the current index to decide whether to grep
through an old REVISION, because there are paths that exists in the
old revision that don't exist in the current index.  The sparsity
rules are the only things that can tell us whether such a path would
be marked as SKIP_WORKTREE if we were to check it out.  So it makes
sense to use the sparsity patterns when looking at REVISIONS.  When
dealing with the current worktree, we can check SKIP_WORKTREE
directly.  Usually that'll give the same answer as asking the sparsity
rules but as per [1] the two aren't always identical.  Rather than
asking "Would we mark this as SKIP_WORKTREE if we were to checkout
this version right now?", perhaps we should ask "Since we have this
version checked out right now, let's just check the path directly.  Is
it marked as SKIP_WORKTREE?".

Does that sound reasonable?

[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@xxxxxxxxxxxxxxxxxxxxxxxxx/



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux