Re: [BUG REPORT] File names that contain UTF8 characters are unnecessarily escaped in 'git status .' messages

On Sat, May 29, 2021 at 11:27:52AM +0200, Torsten Bögershausen wrote:

> I am not sure that there is a reliable way for Git to detect whether the
> terminal is capable of handling UTF-8.
> This should work reliably under Linux, Windows, Mac and all the supported
> Unix-ish platforms.

Yeah, I'm not sure how such a check would be done. On most Linux systems
I've seen, $LANG will mention "en_US.UTF-8" or similar. But I've no idea
how portable that convention is, not to mention that people may have
more complex setups anyway (e.g., not setting $LANG but setting some of
the LC_* variables).
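
For illustration only, here is a minimal sketch of the kind of environment
check discussed above, assuming the usual POSIX precedence (LC_ALL over
LC_CTYPE over LANG). The function name `locale_claims_utf8` is hypothetical
and is not something Git does. It only reports what the environment claims,
and says nothing about what the terminal or a downstream program can
actually handle.

```python
def locale_claims_utf8(env):
    """Return True if the effective character-type locale names UTF-8.

    `env` is any mapping of environment variables (e.g. os.environ).
    Hypothetical helper, for illustration only.
    """
    # Assumed POSIX precedence: LC_ALL overrides LC_CTYPE, which
    # overrides LANG. An empty value is treated as unset.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            # Locale names look like language_TERRITORY.codeset@modifier.
            codeset = value.partition(".")[2].partition("@")[0]
            # Accept common spellings such as "UTF-8" and "utf8".
            return codeset.lower().replace("-", "") == "utf8"
    return False
```

Calling `locale_claims_utf8(os.environ)` would cover the "some of LC_*"
setups as well as plain $LANG, but it is still only a guess about the
consumer of the output.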

But more importantly, this is not even a UTF-8 problem. It is "can your
terminal do something sensible with high-bit characters in filenames of
your repositories". We don't know the encoding of those filenames (and
you may even have a mix).

(And likewise "terminal" here is really "whatever consumes Git's output,
be it the terminal or some program you've piped to").
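
To make the encoding point concrete, here is a small sketch showing that the
same visible name can be stored as different byte sequences, both containing
high-bit bytes, and that only one of them is valid UTF-8. The example name
and the helper names `has_high_bit` and `is_valid_utf8` are made up for
illustration.

```python
# The same visible name, stored under two different encodings.
name = "B\u00f6gershausen.txt"            # "Bögershausen.txt"
utf8_bytes = name.encode("utf-8")         # o-umlaut becomes 0xC3 0xB6
latin1_bytes = name.encode("latin-1")     # o-umlaut becomes 0xF6


def has_high_bit(raw):
    """True if any byte in the name has its high bit set (>= 0x80)."""
    return any(b >= 0x80 for b in raw)


def is_valid_utf8(raw):
    """True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Both byte strings trip `has_high_bit`, but only `utf8_bytes` passes
`is_valid_utf8`. A repository could contain both kinds of names at once.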

> Having said that, the default could be switched some day in the future.
> Before that is "safe", there may be a transition phase,
> where users are warned that the default may change.
> Scripts calling git need to use `git -c core.quotepath=yes`, or no,
> whatever input they expect.

Yes. If we're going to do anything, I think it would be to say "most
terminals and programs deal with high-bit characters OK these days, so
switching the default is more likely to fix things than break them".

I suspect most scripts would be OK either way. They need to handle
maybe-quoted filenames already, so it is really just a question of
whether the consuming program is OK with the high bits. If so, we could
probably get away with just a mention in the release notes, rather than
an annoying transition phase (which is likely to simply confuse most
users, who are unaware of the issue entirely).
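
As a sketch of what "handling maybe-quoted filenames" can look like in a
script, the following assumes the C-style quoting described in the
core.quotePath documentation: the name is wrapped in double quotes, with
backslash escapes such as \t, \n, \\ and three-digit octal escapes for raw
bytes. The function name `unquote_path` and the exact escape table are
assumptions for illustration, not Git's own code.

```python
# Single-character escapes assumed to follow C conventions.
_SIMPLE_ESCAPES = {
    ord("a"): 0x07, ord("b"): 0x08, ord("t"): 0x09, ord("n"): 0x0A,
    ord("v"): 0x0B, ord("f"): 0x0C, ord("r"): 0x0D,
    ord('"'): ord('"'), ord("\\"): ord("\\"),
}


def unquote_path(field):
    """Return the raw bytes of a path field that may be C-style quoted.

    Unquoted fields are returned unchanged. Hypothetical helper.
    """
    if len(field) < 2 or not (field.startswith(b'"') and field.endswith(b'"')):
        return field
    body = field[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        c = body[i]
        if c != ord("\\"):
            out.append(c)
            i += 1
            continue
        i += 1  # skip the backslash
        if i >= len(body):
            raise ValueError("dangling backslash")
        c = body[i]
        if c in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[c])
            i += 1
        elif ord("0") <= c <= ord("7"):
            digits = body[i:i + 3]
            if len(digits) != 3:
                raise ValueError("truncated octal escape")
            # int() raises ValueError on non-octal digits; append()
            # raises ValueError if the value does not fit in a byte.
            out.append(int(digits, 8))
            i += 3
        else:
            raise ValueError("unknown escape")
    return bytes(out)
```

A script that already does something like this gets the same bytes back
whether the name arrived as "B\303\266gershausen.txt" or unquoted, so for
such a script the remaining question is only whether it copes with the raw
high-bit bytes.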

But I'd feel more confident if whoever proposes such a change does some
research on how piping such names into common tools and scripting
languages works (both for UTF-8 and non-UTF-8 names).
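
One possible shape for that kind of experiment, as a rough sketch: pipe raw
name bytes into a child process and see whether it copes. The consumer here
is a stand-in (a tiny Python child that strictly decodes stdin as UTF-8),
and `consumer_accepts` is a hypothetical helper; a real survey would swap in
the actual tools and languages of interest.

```python
import subprocess
import sys

# Stand-in consumer: reads stdin as bytes and exits non-zero if the
# input is not valid UTF-8. It models a program that assumes UTF-8.
STRICT_CONSUMER = (
    "import sys\n"
    "data = sys.stdin.buffer.read()\n"
    "try:\n"
    "    data.decode('utf-8')\n"
    "except UnicodeDecodeError:\n"
    "    sys.exit(1)\n"
)


def consumer_accepts(raw_name):
    """Pipe one newline-terminated name to the stand-in consumer.

    Returns True if the child exits with status 0.
    """
    proc = subprocess.run(
        [sys.executable, "-c", STRICT_CONSUMER],
        input=raw_name + b"\n",
    )
    return proc.returncode == 0
```

With this stand-in, a UTF-8 name and an octal-quoted (pure ASCII) name are
both accepted, while a Latin-1 name containing a high-bit byte is rejected.
That is the non-UTF-8 case the research would need to cover.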

-Peff


