Re: [PATCH v3 02/11] setup: introduce startup_info->original_cwd

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Sun, 28 Nov 2021 19:04:25 +0100

On Sat, Nov 27 2021, Elijah Newren wrote:

> On Sat, Nov 27, 2021 at 6:00 AM Ævar Arnfjörð Bjarmason
> <avarab@xxxxxxxxx> wrote:
>>
>> On Fri, Nov 26 2021, Elijah Newren via GitGitGadget wrote:
>>
>> [Just some more "I haven't really looked at this in all that much
>> detail" commentary, so maybe it's stupid, sorry]
>>
>> > From: Elijah Newren <newren@xxxxxxxxx>
>> >
>> > Removing the current working directory causes all subsequent git
>> > commands run from that directory to get confused and fail with a message
>> > about being unable to read the current working directory:
>> >
>> >     $ git status
>> >     fatal: Unable to read current working directory: No such file or directory
>> >
>> > Non-git commands likely have similar warnings or even errors, e.g.
>> >
>> >     $ bash -c 'echo hello'
>> >     shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
>> >     hello
>>
>> Is that really realistic?
>
> I have 321 shell scripts (or symlinks thereto) in /usr/bin/ on my
> Fedora laptop, and 951 shell scripts in /usr/bin/ on my Ubuntu
> workstation at $DAYJOB.  That's not even considering stuff in other
> directories.  Note that I didn't place any of those scripts in
> /usr/bin/; they came from the distribution and/or corporate packages
> from others.  Any invocation of bash will see the above "error"
> message.  Granted, bash calls it an 'error' but continues anyway, but
> I thought it showed that there were clearly programs besides git where
> users could run into problems.
>
>> Any "normal" command would use "pwd" or look
>> at $PWD, both of which "work", this error is only because we're starting
>> a new shell.
>
> <grin>
>
> Yeah, good point.  Who would use an unusual program like git?  Or a
> shell script?  Or a java program?  Or emacs?  Or other programs like
> them?  Besides, git, bash, java, and emacs are all relatively young
> commands with small development teams.  Maybe we should just rely on
> users only using commands that get the pwd/$PWD detail right; perhaps
> limiting to commands that are more mature and have bigger development
> teams than those four.
>
> Silly users.
>
> ;-)
>
> Sorry, couldn't resist a little friendly teasing.
>
>
> However, for "pwd"...by "work", do you mean "doesn't necessarily
> work"?  On my machine:
>
> $ mkdir gone
> $ cd gone
> $ rmdir ../gone
> $ pwd -P
> pwd: error retrieving current directory: getcwd: cannot access parent
> directories: No such file or directory
> $ pwd
> pwd: error retrieving current directory: getcwd: cannot access parent
> directories: No such file or directory
> $ echo $PWD
> /home/newren/floss/git/gone
>
> If I do not run `pwd -P` first then a plain `pwd` works.  But your
> advice to use `pwd` seems misguided for programs, since they'll hit
> this problem if users have run a previous `pwd -P`.  (The $PWD trick
> would have worked, as shown above)
>
>> I wonder if it was just because you ran into our bin-wrappers edge case,
>> but that should be really obscure for any real users.

For some reason I was under the misimpression that the "#!/bin/bash"
part of the bin-wrappers and other scripts somehow immunized them from
the $PWD/"pwd" reset, and it was only the programs they invoked (like
git in the bin-wrappers) that didn't get the values passed along.

But that's clearly incorrect as you demonstrate above, so the only thing
that'll work OK (seemingly) is running "pwd" (but not "pwd -P"), or
looking at $PWD in your terminal shell itself.

Invoking non-POSIX shell programs "works" in that they can use the same
trick, after the dance of "mkdir x && cd x && rm -rf ..x" e.g. Perl
says:

    $ perl -MCwd=getcwd -wE 'say $ENV{PWD}; say getcwd'
    /home/avar/g/git/x
    Use of uninitialized value in say at -e line 1

This "pwd -P" behavior isn't just something weird in your shell & mine,
it semse to be mandated by POSIX:
https://pubs.opengroup.org/onlinepubs/007904875/utilities/pwd.html

All of which is to say I'm much more sympathetic to this approach
now. I.e. it seemed like purely a way to work around fixable breakages
in our own and other programs. Well, I guess "don't use shellscript if
you care about this edge case" is a "fix", but not a trivial one.

I do still think a better starting point would be fixing the setup.c
dying in our own code, and see where that leaves us, but up to you
etc. I'm not going to work on it any time soon.

> Yes, I agree our bin-wrappers is a really obscure edge case not worth
> considering.  I don't see how you jump from there to painting all
> shell scripts combined with the same brush, though.

*nod*

>> > This confuses end users, particularly since the command they get the
>> > error from is not the one that caused the problem; the problem came from
>> > the side-effect of some previous command.
>> >
>> > We would like to avoid removing the current working directory of our
>> > parent process; towards this end, introduce a new variable,
>> > startup_info->original_cwd, that tracks the current working directory
>> > that we inherited from our parent process.  For convenience of later
>> > comparisons, we prefer that this new variable store a path relative to
>> > the toplevel working directory (thus much like 'prefix'), except without
>> > the trailing slash.
>>
>> I'm still not clear at all on why we need a "original_cwd" at all then
>> as opposed to just using "prefix" (or adding "the_prefix" if passing it
>> down is painful). I.e. we discover our relative path, we resolve the
>> relative path to the root, can't we use that as the "don't remove our
>> CWD" guard?
>>
>> Does our prefix change at some point, then maybe "orig_prefix" would
>> make more sense?
>
> No; see the code comment in the same patch:
>
> +       /*
> +        * startup_info->original_cwd points to the current working
> +        * directory we inherited from our parent process, which is a
> +        * directory we want to avoid removing.
> +        *
> +        * For convience, we would like to have the path relative to the
> +        * worktree instead of an absolute path.
> +        *
> +        * Yes, startup_info->original_cwd is usually the same as 'prefix',
> +        * but differs in two ways:
> +        *   - prefix has a trailing '/'
> +        *   - if the user passes '-C' to git, that modifies the prefix but
> +        *     not startup_info->original_cwd.
> +        */
>
> It's never equal to prefix, even though it's usually semantically
> referring to the same directory.  However, even if it weren't for the
> trailing slash issue, the -C case means it is not appropriate to think
> of it as "orig_prefix" either.

Ah, with -C of e.g. t/helper we'll first chdir(t/helper), and then run
the usual setup.c dance to find that we need to chdir() again to the
(equivalent of) "../../". But our prefix stays at "t/helper".

I'm a bit confused about the trailing slash case, isn't the prefix
always going to point to a directory? Why would t/helper v.s. t/helper/
matter?

I think it won't matter for rmdir(2) et al, but maybe I'm wrong.

What got me confused about the "prefix" v.s. "original_cwd" is that I
was assuming they'd be the same. The commentary on setup_git_directory()
says as much, i.e.:

    Returns the "prefix", a path to the current working directory
    relative to the work tree root, or NULL,

But of course we know that's a white lie, it's not the $PWD/getcwd(). So
you're only trying to save the user in cases of e.g. (in t/helper):

    git rm ../helper

If they actually run it while in t/helper, but would like to explicitly
omit the case of (at the top-level):

    git -C t/helper rm ../helper

That's fair enough I guess. I'd just assumed those cases would be
treated the same way.

Even in that case, I can't think of a case where this "original_cwd"
wouldn't be made redundant by some boolean flag to accompany "prefix" to
indicate that we've chdir'd out of the prefix (or not).