Re: git on MacOSX and files with decomposed utf-8 file names

On Jan 16, 2008, at 11:38 PM, Linus Torvalds wrote:
> On Wed, 16 Jan 2008, Kevin Ballard wrote:
>> The only way to argue that normalization is wrong is by providing a
>> good reason to preserve the exact byte sequence, and so far the only reason
>> I've seen is to help git.
>
> Git doesn't care. Just use the *same* sequence everywhere. Make sure
> something doesn't change it. Because if something changes it, git will
> track it.

The problem is that you don't control the sequence that everybody uses.

See this example:

melo@speed(~)$ uname -a
Linux speed.simplicidade.org 2.6.9-55.ELsmp #1 SMP Wed May 2 14:28:44 EDT 2007 i686 i686 i386 GNU/Linux
melo@speed(~)$ set | grep LANG
LANG=en_US.UTF-8
melo@speed(~)$ mkdir t
melo@speed(~)$ cd t
melo@speed(~/t)$ git init
Initialized empty Git repository in .git/
melo@speed(~/t)$ touch á
melo@speed(~/t)$ git-add á
melo@speed(~/t)$ git-commit -m "added a in utf8"
Created initial commit 7a473a2: added a in utf8
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 "\303\241"
melo@speed(~/t)$ export LANG=en_US
melo@speed(~/t)$ touch á
melo@speed(~/t)$ ls -la
total 12
drwxrwxr-x   3 melo melo 4096 Jan 16 23:44 .
drwx--x--x  31 melo melo 4096 Jan 16 23:43 ..
-rw-rw-r--   1 melo melo    0 Jan 16 23:44 á
-rw-rw-r--   1 melo melo    0 Jan 16 23:43 á
drwxrwxr-x   8 melo melo 4096 Jan 16 23:43 .git
melo@speed(~/t)$ git-add á
melo@speed(~/t)$ git-commit -m "added a in iso-latin-1"
Created commit 4282fca: Oláx!
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 "\341"

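So two users with different LANG settings (simulated in this test by changing LANG in one shell) will be in trouble in no time: the same visible name ends up stored as two different byte sequences, "\303\241" and "\341", and git tracks them as two different files.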
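
As a rough illustration of where those byte sequences come from, here is a minimal Python sketch. It assumes the second shell wrote the name as ISO-8859-1, and it also shows the decomposed form from the subject line. The helper `git_quote` is a hypothetical name of mine that only imitates the octal escaping seen in the output above; it is not git's own code.

```python
import unicodedata

name = "\u00e1"  # 'á' as one precomposed code point (NFC)

utf8_bytes = name.encode("utf-8")            # what a UTF-8 locale stores
latin1_bytes = name.encode("iso-8859-1")     # assumed encoding under LANG=en_US
nfd_bytes = unicodedata.normalize("NFD", name).encode("utf-8")  # 'a' + combining acute


def git_quote(raw: bytes) -> str:
    """Hypothetical helper: octal-escape non-ASCII bytes, like the commit output."""
    parts = []
    for b in raw:
        if 0x20 <= b < 0x7F and b not in (0x22, 0x5C):
            parts.append(chr(b))          # printable ASCII stays as-is
        else:
            parts.append("\\%03o" % b)    # everything else becomes \ooo
    return '"' + "".join(parts) + '"'


print(git_quote(utf8_bytes))    # "\303\241"
print(git_quote(latin1_bytes))  # "\341"
print(git_quote(nfd_bytes))     # "a\314\201"
```

All three would be displayed as the same character in a matching terminal, yet none of them compares equal byte for byte.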

What I take from this conversation is that, for each project I work on, I have to specify which encoding all users should use before they start using git with files whose names contain accented characters.
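
One way such a per-project rule could be checked is sketched below in Python. It assumes, purely for illustration, that the project agreed on UTF-8 in precomposed (NFC) form; the function name `is_utf8_nfc` is hypothetical and nothing like it is part of git.

```python
import unicodedata


def is_utf8_nfc(raw_name: bytes) -> bool:
    """Return True if raw_name is valid UTF-8 and already in NFC form.

    Assumption: the project convention is UTF-8 + NFC. Any other agreed
    encoding would need a different check.
    """
    try:
        text = raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return False  # e.g. an ISO-8859-1 name such as b"\xe1"
    return unicodedata.normalize("NFC", text) == text


print(is_utf8_nfc(b"\xc3\xa1"))   # the name from the first commit
print(is_utf8_nfc(b"\xe1"))       # the name from the second commit
print(is_utf8_nfc(b"a\xcc\x81"))  # decomposed form
```

Calling `os.listdir(b".")` returns names as raw bytes, so a check like this could be run over a working tree before `git add`.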

The difference I see between us is that if I tell my filesystem that I want to name my file with a particular string encoded in X, users using encoding Y will be able to read it correctly. I like my filesystem to make that work for me.

Best regards,
--
Pedro Melo
Blog: http://www.simplicidade.org/notes/
XMPP ID: melo@xxxxxxxxxxxxxxxx
Use XMPP!

