Re: Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one")

On Mon, Dec 6, 2021 at 4:16 AM Uwe Kleine-König
<u.kleine-koenig@xxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> I admit this is somewhat of a corner case, still it happens in the
> reality of our admin team ...
> Initially this was noticed after upgrading the OS from Debian buster
> (with git 2.20.1) to Debian bullseye (with git 2.30.2).
>
> (wgit is just a wrapper for git to call it from my ~/src/git.)
>
> This is the good ("old") case:
>
>         uwe@taurus:~/tmp/8d92fb29270$ wgit version
>         git version 2.25.2.7.g0bbd0e8b5233
>
>         uwe@taurus:~/tmp/8d92fb292706$ wgit init
>         Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
>
>         uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
>         uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
>         uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
>         Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
>         uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
>
>         uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
>         uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile
>         uwe@taurus:~/tmp/8d92fb292706$ wgit status
>         On branch master
>
>         No commits yet
>
>         Changes to be committed:
>           (use "git rm --cached <file>..." to unstage)
>                 new file:   subdir/somefile

Eek, that's bad.  I think there are a number of dragons down that route.

> with 8d92fb292706, the following happens:
>
>         uwe@taurus:~/tmp/8d92fb292706$ wgit version
>         git version 2.25.2.8.g8d92fb292706
>         uwe@taurus:~/tmp/8d92fb292706$ wgit init
>         Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
>         uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
>         uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
>         uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
>         Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
>         uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
>         uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
>         uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile

Not optimal; more on this below.

>         uwe@taurus:~/tmp/8d92fb292706$ wgit status
>         On branch master
>
>         No commits yet
>
>         Untracked files:
>           (use "git add <file>..." to include in what will be committed)
>                 subdir/
>
>         nothing added to commit but untracked files present (use "git add" to track)

This part looks good to me.

> So git after 8d92fb292706 doesn't add files from a subdirectory if
> said subdirectory is tracked in git, too.
>
> While I'm not sure which of the two behaviours is the bogus one, this is
> a change in behaviour that I guess wasn't intended in 8d92fb292706.

I put some effort separate from that commit into avoiding accidentally
recursing into nested git dirs; see e.g. commit 09487f2cba ("clean:
avoid removing untracked files in a nested git repository",
2019-09-17).  So, I was slightly surprised that some other commit
hadn't fixed this.

However, it's not all that surprising to me that 8d92fb292706 affected
this.  Prior to that commit, we visited untracked paths which were n
directories deep a ridiculous 2^n times.  But what made it even more
fun was that the status returned for any given path (tracked, ignored,
not interesting to the traversal, etc.) was not always the same; later
traversals might return something different than earlier traversals.
That confusion made it real "fun" trying to ensure no regressions when
reducing the number of visits to any given path from 2^n down to 1.
The fact that side effects of the traversals (the population of the
dir.entries and dir.ignored) could have essentially functioned to
override a later traversal's return status certainly didn't help; it
was such a mess.
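
To make the 2^n concrete, here is a toy model (not the actual dir.c
code).  It assumes a chain of nested untracked directories in which
each level walks its child twice, once to classify it and once to
collect entries from it; that two-walk shape is an assumption made for
illustration, and the function names are made up.  Each function
returns how many times the innermost directory gets visited.

```c
#include <assert.h>

/*
 * Toy model only; none of this is git code.  `depth` is the number of
 * nested untracked directories below the starting point.
 */

/* Assumed old shape: every level walks its child twice. */
unsigned long visits_two_pass(unsigned depth)
{
	/* keep 2^depth representable in the return type */
	assert(depth < 8 * sizeof(unsigned long));

	if (depth == 0)
		return 1;	/* reached the innermost directory */

	/* one walk to classify the child, a second walk to collect from it */
	return visits_two_pass(depth - 1) + visits_two_pass(depth - 1);
}

/* Linear shape: every level walks its child exactly once. */
unsigned long visits_single_pass(unsigned depth)
{
	if (depth == 0)
		return 1;
	return visits_single_pass(depth - 1);
}
```

With depth 10, the two-pass model visits the innermost directory 1024
times while the single-pass model visits it once.  The model ignores
the other half of the problem, namely that the repeated walks did not
always return the same status for a path.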

But, interestingly, the fixed behavior here also depends pretty
strongly on commit b9670c1f5e ("dir: fix checks on common prefix
directory", 2019-12-19) which came months earlier.  This is
particularly important in combination with the following comment from
dir.h:
    /**
     * If set, recurse into a directory that looks like a Git directory.
     * Otherwise it is shown as a directory.
     */
    DIR_NO_GITLINKS = 1<<3,
which suggests that cmd_add() (which didn't set this flag) should
never have been recursing into a Git-tracked directory.  In other
words, this was a bug all along.

> Is this something that needs fixing?

I agree with brian elsewhere in this thread that not adding the file
is correct.  However, two points:

* I would prefer to see a warning/error from git add when it doesn't
add a path.  (Any takers?  #leftoverbits maybe?)

* It is possible that one might want to be able to force the addition
of files to an outer repository despite existing within a directory
tracked by an inner git repository, perhaps with a double `--force`
being passed to git-add (much like git-clean allows).  If so, that
could be implemented via the addition of
    dir.flags |= DIR_NO_GITLINKS;
to cmd_add() when the double force is detected.
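
A rough, standalone sketch of that idea follows.  It is not a patch
against builtin/add.c: `struct toy_dir`, `toy_apply_force()` and
`toy_recurses_into_git_dir()` are hypothetical stand-ins, and only the
DIR_NO_GITLINKS value is taken from the dir.h snippet quoted above.
How git-add would count repeated `--force` options is assumed here,
not checked.

```c
#include <assert.h>

/* Stand-in for the flag quoted from dir.h above. */
enum {
	DIR_NO_GITLINKS = 1 << 3
};

/* Hypothetical, much smaller stand-in for git's dir_struct. */
struct toy_dir {
	unsigned flags;
};

/*
 * Hypothetical helper: only a double --force opts in to recursing
 * into a directory that looks like a Git directory.
 */
void toy_apply_force(struct toy_dir *dir, int force_count)
{
	assert(dir);
	if (force_count >= 2)
		dir->flags |= DIR_NO_GITLINKS;
}

/* Mirrors the dir.h comment: flag set means recurse, unset means don't. */
int toy_recurses_into_git_dir(const struct toy_dir *dir)
{
	return (dir->flags & DIR_NO_GITLINKS) != 0;
}
```

With a zero-initialized `struct toy_dir`, calling `toy_apply_force()`
with a count of 1 leaves the flag unset, so the nested repository
would still be shown as a directory; a count of 2 sets it.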


Hope that helps,
Elijah



