Re: git 2.34.0: Behavior of `**` in gitignore is different from previous versions.

Chris Torek <chris.torek@xxxxxxxxx> · Sat, 20 Nov 2021 14:41:44 -0800

On Fri, Nov 19, 2021 at 12:44 PM Derrick Stolee <stolee@xxxxxxxxx> wrote:
> So the problem is this: I want to know "I have a file named <X>, and
> a certain pattern set, does <X> match the patterns or not?" but in
> fact it's not just "check <X> against the patterns in order"  ...

Right.  You're starting with the wrong problem by assuming that
you *do* have a file named <X> in the first place.  You would not
have it at all in some cases.

I've always found this somewhat puzzling, or at least very tricky
to explain, in the presence of a file that is tracked because it's
already in the index, despite having a parent directory that would
have kept the file from being found.  So the standard
explanation -- at least, the one I use -- is this:

 * Git opens and reads the working tree directory.  For each file
   or directory that is actually present here, Git checks it
   against the ignore rules.  Some rules match only directories
   and others match both directories and files.  Some rules say
   "do ignore" and some say "do not ignore".

 * The *last* applicable rule wins.

 * If this is a file and the file is ignored, it's ignored.
   Unless, that is, it's in the index already, because then it's
   tracked and can't be ignored.

 * If this is a directory and the directory is ignored, it's
   not even opened and read.  It's not in the index because
   directories are never in the index (at least nominally).
   If it is opened and read, the entire set of rules here
   apply recursively.

This works, but skips over files that are in the index and are in
a directory that won't be read.  So I add one last rule, which is
that already-tracked files are checked despite not being scanned
during the above process.  (That's because the process above is
only used to determine which files to *complain about* at `git
status` time, or auto-add with `git add --all` or the like.)
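
A minimal sketch of the scan described above, in Python, may make
the order of operations easier to see.  It is a toy model, not
Git's code: the names `parse_rules`, `is_ignored` and `scan` are
made up for illustration, the working tree is a nested dict, and
pattern matching is reduced to `fnmatch` on either the basename or
the whole path.

```python
from fnmatch import fnmatchcase

def parse_rules(lines):
    """Turn simplified ignore lines into (pattern, negated, dir_only)."""
    rules = []
    for line in lines:
        negated = line.startswith("!")
        pattern = line[1:] if negated else line
        dir_only = pattern.endswith("/")
        rules.append((pattern.rstrip("/"), negated, dir_only))
    return rules

def is_ignored(path, is_dir, rules):
    """Check every rule in order; the last applicable one wins."""
    ignored = False
    for pattern, negated, dir_only in rules:
        if dir_only and not is_dir:
            continue  # directory-only rule, but this is a file
        # Simplification (assumption): a pattern containing "/" is
        # matched against the whole path, otherwise the basename.
        target = path if "/" in pattern else path.rsplit("/", 1)[-1]
        if fnmatchcase(target, pattern):
            ignored = not negated
    return ignored

def scan(tree, rules, index, prefix=""):
    """Return untracked, non-ignored files.

    tree:  nested dict; a dict value is a directory, None is a file.
    index: set of tracked paths (never treated as ignored).
    """
    found = []
    for name, child in sorted(tree.items()):
        path = prefix + name
        if isinstance(child, dict):
            if is_ignored(path, True, rules):
                continue  # ignored directory: not opened, not read
            found += scan(child, rules, index, path + "/")
        elif path in index:
            continue  # already tracked, so the rules do not apply
        elif not is_ignored(path, False, rules):
            found.append(path)
    return found

tree = {"a": {"b": {"c": None, "d": None}}, "x": None}
print(scan(tree, parse_rules(["*", "!a/b/c"]), set()))
print(scan(tree, parse_rules(["*", "!a/", "!a/b/", "!a/b/c"]), set()))
```

In this model the first call reports nothing, because `a` itself
matches `*` and is never opened, so `!a/b/c` is never consulted.
The second call reports only `a/b/c`.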

So: these files are not ignored ... but is their directory ever
read?  The actual answer, per testing, is "no", which matches
the parenthetical in the paragraph just above.  But no
documentation says this, explicitly, one way or another.

Incidentally, while I have no patches to contribute at this point,
I do think it would be sensible for Git to read a `.gitignore`
that says:

    *
    !a/b/c

as meaning:

    *
    !a/
    !a/b/
    !a/b/c

That is, declaring an un-ignored file within some ignored
directory should automatically imply that we *must* un-ignore the
parents.  It doesn't seem like it should be that hard to insert
some extra rules internally here (though without `*` we'd want
to ignore `a/*`, i.e., the above is optimized beyond what I
would expect from automated rule insertion).
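
To make the idea concrete, here is a rough sketch in Python of
what such automated rule insertion could look like.  It is purely
illustrative: `expand_negations` is a hypothetical helper, not
anything in Git, and it only handles plain negated paths (no
wildcard handling, and it does not add the `a/*`-style re-ignore
rules mentioned above).

```python
def expand_negations(lines):
    """Insert "!parent/" rules before each negated nested path."""
    out = []
    for line in lines:
        if line.startswith("!"):
            body = line[1:]
            anchored = body.startswith("/")  # keep a leading "/" anchor
            parts = body.strip("/").split("/")
            lead = "!/" if anchored else "!"
            # Un-ignore every parent directory, outermost first.
            for depth in range(1, len(parts)):
                out.append(lead + "/".join(parts[:depth]) + "/")
        out.append(line)  # the original rule always stays in place
    return out

print(expand_negations(["*", "!a/b/c"]))
# prints ['*', '!a/', '!a/b/', '!a/b/c']
```

The inserted rules go immediately before the rule that triggered
them, so their position relative to the other rules is preserved
and "last applicable rule wins" still works as before.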

Chris