On Jan 16, 2008, at 11:38 PM, Linus Torvalds wrote:
> On Wed, 16 Jan 2008, Kevin Ballard wrote:
>> The only way to argue that normalization is wrong is by providing a
>> good reason to preserve the exact byte sequence, and so far the
>> only reason I've seen is to help git.
>
> Git doesn't care. Just use the *same* sequence everywhere. Make sure
> something doesn't change it. Because if something changes it, git will
> track it.

The problem is that you don't control the sequence that everybody uses.
See this example:
melo@speed(~)$ uname -a
Linux speed.simplicidade.org 2.6.9-55.ELsmp #1 SMP Wed May 2 14:28:44
EDT 2007 i686 i686 i386 GNU/Linux
melo@speed(~)$ set | grep LANG
LANG=en_US.UTF-8
melo@speed(~)$ mkdir t
melo@speed(~)$ cd t
melo@speed(~/t)$ git init
Initialized empty Git repository in .git/
melo@speed(~/t)$ touch á
melo@speed(~/t)$ git-add á
melo@speed(~/t)$ git-commit -m "added a in utf8"
Created initial commit 7a473a2: added a in utf8
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 "\303\241"
melo@speed(~/t)$ export LANG=en_US
melo@speed(~/t)$ touch á
melo@speed(~/t)$ ls -la
total 12
drwxrwxr-x 3 melo melo 4096 Jan 16 23:44 .
drwx--x--x 31 melo melo 4096 Jan 16 23:43 ..
-rw-rw-r-- 1 melo melo 0 Jan 16 23:44 á
-rw-rw-r-- 1 melo melo 0 Jan 16 23:43 á
drwxrwxr-x 8 melo melo 4096 Jan 16 23:43 .git
melo@speed(~/t)$ git-add á
melo@speed(~/t)$ git-commit -m "added a in iso-latin-1"
Created commit 4282fca: Oláx!
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 "\341"
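So two users with different LANG settings (simulated in this test by
changing LANG in a single shell) end up with two distinct files for
what both of them typed as the same name, and will be in trouble in
no time.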
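
The two paths git recorded in the transcript are the same character
under two encodings. A minimal Python sketch, assuming only the
standard library, reproduces the escapes git printed ("\303\241" and
"\341"). The helper name git_style_octal is hypothetical and not part
of git; unlike git, it escapes every byte rather than only the
non-ASCII ones.

```python
def git_style_octal(name: str, encoding: str) -> str:
    """Encode `name` and render each byte as a backslash octal escape.

    Hypothetical helper: it imitates the quoting seen in the transcript,
    but escapes all bytes, not just the non-ASCII ones.
    """
    return "".join(f"\\{byte:03o}" for byte in name.encode(encoding))


# U+00E1 is LATIN SMALL LETTER A WITH ACUTE, the 'á' typed in both sessions.
utf8_name = git_style_octal("\u00e1", "utf-8")
latin1_name = git_style_octal("\u00e1", "iso-8859-1")

print(utf8_name)    # two bytes under UTF-8
print(latin1_name)  # one byte under ISO-8859-1
```

Since git stores and compares the raw bytes, these are two unrelated
paths as far as the repository is concerned.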
What I take from this conversation is that, for each project I work
on, I have to agree on a single filename encoding with all users
before anyone starts adding files with accented characters to git.
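
One way such a per-project convention could be checked is sketched
below. It assumes the project settles on UTF-8, which is my
assumption for illustration and not something decided in this thread.
The function name non_utf8_names is hypothetical.

```python
def non_utf8_names(names: list[bytes]) -> list[bytes]:
    """Return the raw file names that are not valid UTF-8.

    Assumption: the project-wide convention is UTF-8.
    """
    bad = []
    for raw in names:
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(raw)
    return bad


# The two names from the transcript plus a plain ASCII one.
sample = [b"\xc3\xa1", b"\xe1", b"README"]
flagged = non_utf8_names(sample)
print(flagged)
```

On a real working tree the names could come from os.listdir(b"."),
which returns bytes when given a bytes path, so nothing is decoded
behind your back.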
The difference I see between us is this: when I tell my filesystem to
name a file with a particular string encoded in X, users working in
encoding Y can still read that name correctly. I like having my
filesystem do that work for me.
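
As a rough illustration of the translation I mean, here is a sketch
in Python. It is not how any particular filesystem implements this;
present_name and its parameters are hypothetical names, and it
assumes the stored encoding is known and the character exists in the
user's encoding.

```python
def present_name(stored: bytes, stored_encoding: str, user_encoding: str) -> bytes:
    """Re-encode a stored file name into the encoding a user works in.

    Hypothetical sketch: decode to Unicode first, then encode for the user.
    Raises UnicodeEncodeError if the user's encoding lacks a character.
    """
    return stored.decode(stored_encoding).encode(user_encoding)


# A name stored as UTF-8, shown to a user working in ISO-8859-1 ...
as_latin1 = present_name(b"\xc3\xa1", "utf-8", "iso-8859-1")
# ... and the reverse direction.
as_utf8 = present_name(b"\xe1", "iso-8859-1", "utf-8")
print(as_latin1, as_utf8)
```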
Best regards,
--
Pedro Melo
Blog: http://www.simplicidade.org/notes/
XMPP ID: melo@xxxxxxxxxxxxxxxx
Use XMPP!