Re: Bug: Changing folder case with `git mv` crashes on case-insensitive file system

So, I'm just a dumb Git user who doesn't even write C, so much of this
discussion is over my head, but I have a few thoughts that may be
helpful:

• The mv utility on Mac is capable of doing `mv bär.txt bÄr.txt` just
fine. Maybe `git mv` can learn something from whatever `mv` does?

• On a case-insensitive file system, `git mv somedir sOMEdir` is a
rename. But on a case-sensitive file system, it might NOT be a rename;
it might be the case that `somedir` and `sOMEdir` both exist and that
the command should put `somedir` inside `sOMEdir`. I mention this
because I can imagine a naive fix for the original bug that compares
the two names case-insensitively and, as a result, wrongly treats such
a command as a rename on case-sensitive file systems. It's probably
worth having a test that this scenario is handled correctly on
case-sensitive file systems. (I haven't checked whether Torsten's
proposed diff falls into this trap.)

• Above, Torsten mentions that there are filesystem-specific rules
about what names are equal to each other that Git can't easily handle,
because they go beyond just ASCII case changes. In that case, maybe
the right solution is to always defer the question to the filesystem
rather than Git trying to figure out the answer "in its head"?

  That is: first check the inode or file ID of the src and dst passed
to `git mv`. If they are different and the second one is a folder,
move src inside the existing folder. If they are the same, or the
second one is not a folder, do a rename.

  It seems to me that this approach automatically handles stuff like
`git mv bär.txt bÄr.txt` plus any other rules about names being equal
(like two different sequences of code points that both express "à"),
all without Git ever needing to explicitly check whether two names are
case-insensitively equal. Am I missing something?
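
  To make the idea concrete, here is a minimal Python sketch of that
decision. It is not Git's code (Git is written in C), and
`classify_move` is a name made up for illustration. It assumes that
`os.path.samestat`, which compares device and inode numbers, is a good
enough stand-in for "same inode or file ID", and it also assumes that
a dst which does not exist yet means a plain rename:

```python
import os
import stat


def classify_move(src: str, dst: str) -> str:
    """Decide how a hypothetical `mv src dst` should behave.

    Returns "rename" or "move_into". The file system, not a string
    comparison, decides whether src and dst are the same object.
    """
    src_stat = os.stat(src)
    try:
        dst_stat = os.stat(dst)
    except FileNotFoundError:
        # Assumption: nothing exists at dst, so this is a plain rename.
        return "rename"

    # Same device and inode: both names resolve to one object, which is
    # what happens for `somedir` vs `sOMEdir` when the file system
    # treats the two names as equal.
    if os.path.samestat(src_stat, dst_stat):
        return "rename"

    # Different objects and dst is a directory: move src inside it.
    if stat.S_ISDIR(dst_stat.st_mode):
        return "move_into"

    # Different objects and dst is not a directory: rename over it.
    return "rename"
```

  With this, `classify_move("somedir", "sOMEdir")` should come out as a
rename wherever the file system resolves both names to the same
directory, and as a move into the folder on a case-sensitive file
system where both directories exist separately.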

Sorry if any of the above is dumb or if I'm reiterating things others
have already said without realising it.

On Thu, May 6, 2021 at 5:34 AM Torsten Bögershausen <tboegi@xxxxxx> wrote:
>
> On Wed, May 05, 2021 at 09:23:05AM +0900, Junio C Hamano wrote:
> > Torsten Bögershausen <tboegi@xxxxxx> writes:
> >
> > > To my understanding we try to rename
> > > foo/ into FOO/.
> > > But because FOO/ already "exists" as directory,
> > > Git tries to move foo/ into FOO/foo, which fails.
> > >
> > > And no, the problem is probably not restricted to macOS,
> > > Windows and all case-insensitive file systems should show
> > > the same, but I haven't tested yet, so it's more a suspicion.
> > >
> > > The following diff allows moving foo/ into FOO/.
> > > If someone wants to make a patch out of it, that would be good.
> >
> > Is strcasecmp() sufficient for macOS whose filesystem has not just
> > case insensitivity but UTF-8 normalization issues?
> >
>
> Strictly speaking: no.
>
> The Git code doesn't handle UTF-8 upper/lower case at all:
> git mv bar.txt BAR.TXT works because strcasecmp() is catching it.
>
> git mv bär.txt BÄR.TXT needs the long way:
> git mv bär.txt baer.txt && git mv baer.txt BÄR.TXT
>
> We have been restricting the case-change-is-allowed to ASCII filenames
> all the time.
> Git has no information about which code points map onto each other,
> since this is all file system dependent.
> NTFS has one way, HFS+, APFS another, VFAT a third one, and if I expose
> ext4 via Samba we probably have another one.
> Not to mention that ext4 can be used case-insensitively on later Linux
> kernels, which sticks to Unicode.
> Or Git repos running on machines using ISO-8859-1; those should be rare
> these days.
>
> That said, people are renaming files in ASCII only and are happy,
> and in that sense renaming directories in ASCII can be supported
> without major hassle.
>
> And the inode approach mentioned as well:
> This could go on top of strcasecmp() to cover non-ASCII filenames
> or other oddities, if someone implements it.
>
>



