Re: git on MacOSX and files with decomposed utf-8 file names

On Thu, Jan 17, 2008 at 05:24:01PM -0800, Linus Torvalds wrote:
> 
> On Fri, 18 Jan 2008, Robin Rosenberg wrote:
> 
> > It uses the local 8-bit codepage, which is not UTF-8, often some latin-inspired
> > thingy, but in Asia multi-byte encodings are used. In western Europe it is
> > Windows-1252, which is almost, but not exactly iso-8859-1. Oh, and then we
> > have the cmd prompt which has another encoding in 8-bit mode.

Yes, the command prompt defaults to the so-called OEM code page, while
GUI programs use another one, which Microsoft calls the "ANSI" code
page. However, Cygwin programs use the ANSI encoding even in the
command prompt. So, in the same command prompt window, you can have
Cygwin programs using one encoding and other Windows console programs
using a different one.
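
To illustrate the mismatch, here is a minimal Python sketch. It assumes
cp1252 as the "ANSI" code page and cp437 as the "OEM" code page; the real
pair depends on the Windows locale, and the helper name
show_in_other_codepage is made up for this example.

```python
# Sketch: one file name, two Windows 8-bit code pages.
# ANSI and OEM below are assumptions; the actual pair depends on the locale.
ANSI = "cp1252"
OEM = "cp437"


def show_in_other_codepage(name, written_as, read_as):
    """Encode `name` in one code page and decode the same bytes in another."""
    raw = name.encode(written_as)
    return raw.decode(read_as)


name = "café.txt"
# Bytes that a program using the ANSI code page would emit for this name.
print(name.encode(ANSI))
# What a console interpreting those bytes as OEM would display instead.
print(show_in_other_codepage(name, ANSI, OEM))
```

Pure ASCII names survive the round trip unchanged, which is why the
problem only shows up with accented or non-Latin file names.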

> 
> Well, if it uses a 8-bit codepage, then that means that as far as the 
> POSIX filename interface is concerned, it has nothing what-so-ever to do 
> with Unicode (ie unicode is just a totally invisible internal encoding 
> issue, not externally visible).

Some people have tried setting the current code page to 65001, which is
the Microsoft code page for UTF-8. However, it seems that this does not
work very well.

http://support.microsoft.com/kb/175392
http://blogs.msdn.com/michkap/archive/2006/03/13/550191.aspx

It seems to me that Win32 API functions work correctly with
UTF-8 (after all, they are just wrappers over UTF-16 functions),
but Microsoft's C library cannot handle UTF-8 (or any other
encoding that requires more than two bytes per character).
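
The byte counts behind that remark can be checked with a short Python
sketch. cp932 (Shift-JIS) is used here only as one example of a
double-byte code page, and utf8_len is a helper made up for the example.

```python
# Sketch: bytes per character in UTF-8, UTF-16 and a double-byte code page.
def utf8_len(ch):
    """Number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


# ASCII, Latin-1 range, BMP symbol, CJK ideograph, and a non-BMP character.
samples = ["a", "é", "€", "日", "\U0001D11E"]
for ch in samples:
    utf16_len = len(ch.encode("utf-16-le"))
    print(repr(ch), "utf-8:", utf8_len(ch), "utf-16:", utf16_len)

# A double-byte code page such as cp932 needs at most two bytes here,
# while UTF-8 needs three for the same ideograph.
print(len("日".encode("cp932")))
```

So code written around a "lead byte plus trail byte" model has no room
for the three- and four-byte sequences that UTF-8 produces.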

> Anybody know which one cygwin/mingw does?

There is a patch for Cygwin that adds UTF-8 support, but the Cygwin
maintainers do not like it, so it has not been integrated. I think
Cygwin 1.7 will support UTF-8, but I have no idea how soon it will be
released.

I don't know much about mingw, but if I am not mistaken, mingw relies
on Microsoft's C library, so I suppose it uses an "OEM" code page for
console programs by default.


Dmitry