On 16 Jan 2008, at 17:32, Johannes Schindelin wrote:
FWIW the issue is that Mac OS X decides that it knows better how to
encode your filename than you could yourself.
More like, Mac OS X has standardized on Unicode and the rest of the
world hasn't caught up yet. Git is the only tool I've ever heard of that
has a problem with OS X using Unicode.
No. That's not at all the problem. Mac OS X insists on storing _another_
encoding of your filename. Both are UTF-8. Both encode the _same_
string. Yet they are different, bytewise. For no good reason.
Stop spreading FUD. Git can handle Unicode just fine. In fact, Git does
not _care_ how the filename is encoded, it _respects_ the user's choice,
not only of the encoding _type_, but the _encoding_, too.
"FUD" is a bit strong, don't you think? HFS+ is the way it is and it
would be nice if Git could deal with it.
The problem is that HFS+ normalizes filenames to avoid multiple files
that appear to have the same name (e.g. "M<A WITH UMLAUT>rchen" vs
"Ma<UMLAUT MODIFIER>rchen", in gitweb/test). This is sort of like
case insensitivity, except that the name is rewritten when the file is
_created_ instead of being preserved. Git, not unreasonably, expects a
file to keep the name it was created with.
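
A small Python sketch of the two spellings, to make the bytewise
difference concrete. It uses the standard unicodedata module, with NFD
standing in for what HFS+ does; that is an assumption, since HFS+ applies
its own variant of canonical decomposition. The variable names are only
for illustration.

```python
import unicodedata

# Precomposed form: 'ä' is the single code point U+00E4.
composed = "M\u00e4rchen"

# Decomposed form: 'a' followed by U+0308 (combining diaeresis).
# Assumption: NFD approximates the decomposition HFS+ applies.
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed), len(decomposed))   # 7 8
print(composed.encode("utf-8"))         # b'M\xc3\xa4rchen'
print(decomposed.encode("utf-8"))       # b'Ma\xcc\x88rchen'

# Same text to a human, different strings to a bytewise comparison.
print(composed == decomposed)           # False

# Normalizing back to NFC recovers the precomposed spelling.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Both byte strings are valid UTF-8 and render identically, which is why
the two names look the same in a terminal.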
As far as I can tell, as long as you add all your internationally
becharactered files to Git from an HFS+ file system using a GUI or
command-line completion, you'll be okay. Trouble starts when a file
gets checked in with the composed form of a character, either by typing
the name on the command line (I'm not sure about this one) or by
committing on another OS. Git will store the filename in composed form,
but the Mac's filesystem will decompose the filename when you check the
file out.
The result looks like this:
vredefort:[git]% git status
# On branch master
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# gitweb/test/Märchen
nothing added to commit but untracked files present (use "git add" to track)
(this is directly after checking out git.git @ v1.5.4-rc3)
There are two things to note here. One is that Git thinks that there
is a new file called "gitweb/test/Märchen" (decomposed) when it's
"really" just the same "gitweb/test/Märchen" (precomposed) that's in
the repository. The other is that Git _thinks_ that the
"gitweb/test/Märchen" (precomposed) it's expecting is still there,
because the filesystem, when asked for "gitweb/test/Märchen" in any
form, will return the file "gitweb/test/Märchen" (decomposed).
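
The mismatch can be imitated in a few lines of Python. This is only a toy
model, not Git's code: the function untracked() and both name lists are
hypothetical, and NFD/NFC from unicodedata stand in for what the
filesystem does. It compares the names recorded in the index with the
names a directory listing returns, first bytewise and then after
normalizing both sides.

```python
import unicodedata

def untracked(index_names, listed_names, normalize=False):
    """Return listed names that have no match among the index names.

    Hypothetical helper: with normalize=True both sides are brought to
    NFC before comparing; otherwise the strings are compared as-is.
    """
    def key(name):
        return unicodedata.normalize("NFC", name) if normalize else name

    known = {key(name) for name in index_names}
    return [name for name in listed_names if key(name) not in known]

# The index holds the precomposed name, as committed on another OS.
index_names = ["gitweb/test/M\u00e4rchen"]

# Assumption: the filesystem hands back the decomposed spelling.
listed_names = [unicodedata.normalize("NFD", index_names[0])]

# Compared as-is, the file on disk looks like a new, untracked file.
print(untracked(index_names, listed_names) == listed_names)   # True

# Compared after normalization, nothing is untracked.
print(untracked(index_names, listed_names, normalize=True))   # []
```

Whether normalizing on comparison is the right fix for Git is a separate
question; the sketch only shows where the phantom untracked file comes
from.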
Trying to check out the "next" branch at this point is a pain since
next's "Märchen" would overwrite the untracked "Märchen".
I can't provide links to any previous discussions about this, but
here's Apple's Technical Q&A on the subject:
http://developer.apple.com/qa/qa2001/qa1235.html
Finding a sane way of allowing git to handle this behaviour is left as
an exercise for the reader.
Eyvind Bernhardsen