Re: [EXTERNAL] Re: BUG: git clean -d cannot remove files from read-only directories

Elijah Newren <newren@xxxxxxxxx> · Thu, 20 Feb 2020 16:42:38 -0800

Hi Adam,

On Thu, Feb 20, 2020 at 3:52 PM Adam Milazzo <Adam.Milazzo@xxxxxxxxxxxxx> wrote:
>
> > Simply because that is how users would expect how the world works (iow, model things after what they are already familiar with).
>
> This seems to be an avoidance of my actual arguments about 1) the purpose of "git clean" and what behavior best matches it, and 2) the violation of the general principle that if a tool invoked programmatically can fail, then there should be a reasonable way for users to avoid the failure if possible. But my response is:
>
> First, there is no obvious choice for what other tool to model "git clean" on, even assuming that it should be so modeled. This goes back to the purpose of "git clean". Is it just a recursive delete? Or is it bringing the directory tree back to a certain state? I'd argue the latter, and if we want to point to existing tools I'd point to rsync, which has no problem deleting files from read-only directories if it's needed to bring a directory tree to the desired state. It doesn't even give a warning about it.

I'm very sympathetic to the fact that "git clean" behavior might not
be optimal or even well defined[1][2][3].  I've recently done work in
the area, including even changing the existing behavior of some
commands, based on arguments about what should be correct behavior.
Some of that work languished for a year and a half, despite fixing
known bugs, because there were edge cases where I couldn't tell what
correct behavior was and no one else seemed to be able to answer
either.

[1] https://lore.kernel.org/git/20190917163504.14566-1-newren@xxxxxxxxx/
[2] https://lore.kernel.org/git/pull.676.v5.git.git.1576790906.gitgitgadget@xxxxxxxxx/
[3] https://lore.kernel.org/git/pull.692.v3.git.git.1579206117.gitgitgadget@xxxxxxxxx/

If you want to do something similar, I think you need to provide a
good rationale not only for what you are trying to achieve, but also
for how to handle the edge cases and probable future requests of a
similar kind, in a way that makes sense to someone who might try to
implement it.  If I were to try to implement your suggestion, your
descriptions would not be at all clear to me about how I should
handle edge cases and the related improvements that folks will ask
for later, not even if this were implemented as a new option.

For example, you talk about bringing the directory tree back to a
certain state -- does that mean git clean should also run 'git reset
--hard'?  I need a more precise model/description...

> Second, I doubt anybody here actually knows (i.e. has data demonstrating) that users expect 'git clean' to behave like 'rm'. Also, I am a user, and it is not what _I_ expect. (And since some people here seem keen to dismiss what I say based on an assumption of ignorance, I've been programming for 30 years, using GNU/Linux, BSD, and other UNIX-like systems for almost 20 years, and using various source control systems for about as long. Not that that should carry any intrinsic weight in this discussion.)
>
> Comparing to "rm" again, there is an easy way for users of "rm" to avoid the error. Simply replace "rm -rf X" with "chmod -R u+w X; rm -rf X". What is the comparable workaround with "git clean"? There is none that I'm aware of, and that's perhaps the main reason why it would be useful for "git clean" to be able to handle it. If there is a reasonable workaround, what is it? The best simple workarounds I've been able to come up with are:
>
> * For "git clean -fd": git status -s -uall | grep -E '^\?\?.*/$' | cut -c 4- | xargs -r chmod -R u+w; git clean -fd
> * For "git clean -fdx": git status -s -uall --ignored | grep -E '^\?\?.*/$' | cut -c 4- | xargs -r chmod -R u+w; git clean -fdx
> * For "git clean -fX": ??
> * For "git clean -f": ??

For every single case, why not just "chmod -R u+w $toplevel_dir"
followed by the git clean command in question, much like you did with
rm?
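
Something along these lines is what I have in mind.  It is only a
sketch: "clean_writable" is a hypothetical helper name, not a git
command, and it assumes the directories are still searchable (u+x) so
that chmod can descend into them.  Note that the recursive chmod also
touches tracked files and the contents of .git; git only records the
executable bit, so adding u+w should not show up as a modification.

```shell
# Hypothetical wrapper (not part of git): make everything under the
# top-level directory writable by the owner, then run "git clean" with
# whatever options were passed in.
clean_writable() {
    # Locate the top of the working tree; fail if not inside a repository.
    toplevel_dir=$(git rev-parse --show-toplevel) || return 1

    # Same idea as "chmod -R u+w X; rm -rf X".
    chmod -R u+w "$toplevel_dir" || return 1

    # Run the clean command in question, e.g. -fd, -fdx, -fX, or -f.
    git clean "$@"
}
```

You would then call "clean_writable -fd" (or -fdx, -fX, -f) wherever
the script currently calls "git clean" directly.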

> These are not reliable because there are various conditions where they fail (including ours), so I'm not sure they are viable approaches except in certain special cases. It's possible to handle all the possibilities with custom scripting, but the workarounds would become quite complex.
>
> So I ask again, if "git clean" won't have any option to handle it like rsync does, what is the workaround that can be placed in a script to get the same behavior? And if there is no reasonable workaround, perhaps it is a useful feature to have "git clean" try a little harder to delete the files, or have an option to do so?

I think the single simple recursive chmod I mentioned above is a
reasonable workaround.

But since you are bringing up rsync as a comparison point...rsync also
affects ACLs, xattrs, devices and other special files, etc.  So, how
much harder is a little harder?  Is there a good mental model for that
being the right amount of harder, or do we just keep extending it
every time the command fails to clean something that users think we
could have wiped out?

Hope that helps,
Elijah