Re: [RFC PATCH 3/3] grep: add option to ignore sparsity patterns

Elijah Newren <newren@xxxxxxxxx> · Tue, 31 Mar 2020 09:48:47 -0700



On Sun, Mar 29, 2020 at 6:13 PM Matheus Tavares Bernardino
<matheus.bernardino@xxxxxx> wrote:
>
> On Thu, Mar 26, 2020 at 3:02 AM Elijah Newren <newren@xxxxxxxxx> wrote:
> >
> > Hi Matheus!
>
> Hi, Elijah.
>
> First of all, thanks for taking the time to go over these topics in
> great detail. I must say it's much clearer for me now.
>
> > On Wed, Mar 25, 2020 at 4:15 PM Matheus Tavares Bernardino
> > <matheus.bernardino@xxxxxx> wrote:
> > >
> [...]
> > One more useful case to consider before we start adding SKIP_WORKTREE
> > into the mix.  Let's say that you have three files:
> >    fileA
> >    fileB
> >    fileC
> > and all of them are tracked.  You have made edits to fileA and fileB,
> > and ran 'rm fileC' (NOT 'git rm fileC', i.e. the deletion is not
> > staged).  Now, you run 'git grep mystring'.  Quick question: Which
> > files are searched for 'mystring'?  Well...
> >   * REVISION and --cached were left out of the git grep command, so
> > working tree files should be searched, not staged versions or versions
> > from other commits
> >   * No flags like --untracked or --no-exclude-standard were included,
> > so only tracked files in the working tree should be searched
> >   * There are two files in the working tree, both tracked: fileA and fileB.
> > So, this searches fileA and fileB.  In particular: NO VERSION of fileC
> > is searched.  fileC may be tracked/cached, but we don't search any
> > version of that file, because this particular command line is about
> > searching the working directory and fileC is not in the working
> > directory.  To the best of my knowledge, git grep has always behaved
> > that way.
> >
> > Users understand the idea of searching the working copy vs. the index
> > vs. "old" (or different) versions of the repository.  They also
> > understand that when searching the working copy, by default a subset
> > of the files are searched.  Tell me: given all this information here,
> > what possible explanation is there for SKIP_WORKTREE entries to be
> > translated into searches of the cache when --cached is not specified?
> > Please square that away with the fact that 'rm fileC' results in fileC
> > NOT being searched.
> >
> > It's just completely, utterly wrong.
>
> Makes sense, thanks. I agree that we shouldn't fall back to the cache
> when searching the working tree.
>
> > Also, hopefully this helps answer your question about --untracked and
> > skip_worktree.  --untracked is only useful when searching through the
> > working tree, and is entirely about adding the "untracked" category to
> > the things we search.  The skip_worktree bit is about adding more
> > granularity to the "tracked" category.  The two are thus entirely
> > orthogonal and --untracked shouldn't change behavior at all in the
> > face of sparse checkouts.
>
> Thanks, your explanation clarified the issue I had. I see now why
> --untracked and --ignore-sparsity don't make sense together.
>
> It also made me think about the combination of --cached and
> --untracked which, IIUC, should be prohibited. I will add a patch in
> v2, making git-grep error out in this case.
>
> > And I also think it explains more when the sparsity patterns and
> > --ignore-sparsity-patterns flags even matter.  The division of working
> > tree files which were tracked into two subsets (those that match
> > sparsity patterns and those that don't) didn't matter because only one
> > of those two sets existed and could be searched.  So the question is,
> > when can the sparsity pattern divide a set of files into two subsets
> > where both are non-empty?  And the answer is when --cached or REVISION
> > is specified.
>
> Makes sense. I will add in --ignore-sparsity's description that it is
> only relevant with --cached or REVISION, as you previously suggested.
> When it is used outside of these cases, though, I think we could just
> warn that --ignore-sparsity will be discarded (to avoid erroring out
> when users have grep.ignoreSparsity enabled).

Not grep.ignoreSparsity but core.ignoreSparsity or core.$WHATEVER  ;-)
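
To make the fileA/fileB/fileC reasoning quoted above concrete, here is a
toy Python model of which paths get searched.  It is only a sketch of the
semantics argued for in this thread, not git's actual code: the Entry
class, the searched_paths() function and its keyword arguments are all
hypothetical names, and the --cached branch encodes the behavior being
proposed for --ignore-sparsity rather than anything that exists today.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for one path known to the repository."""
    path: str
    tracked: bool = True         # has an index entry
    in_worktree: bool = True     # file currently exists on disk
    skip_worktree: bool = False  # SKIP_WORKTREE bit set (outside sparsity patterns)


def searched_paths(entries, cached=False, untracked=False, ignore_sparsity=False):
    """Return the paths a search would visit under the rules discussed above."""
    if cached:
        # Index search: sparsity patterns can split tracked entries into two
        # non-empty subsets, so this is where ignore_sparsity matters.
        # (Assumption: this mirrors the proposal, not current git behavior.)
        return [e.path for e in entries
                if e.tracked and (ignore_sparsity or not e.skip_worktree)]
    # Working tree search: only files present on disk are candidates, and
    # there is no fallback to the index for missing ones.  ignore_sparsity
    # has nothing to act on here; untracked only widens the category.
    return [e.path for e in entries
            if e.in_worktree and (e.tracked or untracked)]
```

With entries for fileA, fileB and a fileC whose in_worktree is False, the
default call visits only fileA and fileB, which is the 'rm fileC' case.
Marking fileC as skip_worktree instead changes nothing for the working
tree search; it only makes a difference once cached=True is passed.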