Mark Junker wrote:
Junio C Hamano schrieb:
I do not know how Macintosh libc implements "struct dirent", but
this approach does not work in general.
IMHO there is no need for this approach to work in general because this
is a fix for MacOSX systems only. I also use d_namlen, which might not be
available on other systems. But on MacOSX this works as expected.
yet you can obtain a path component longer than 256 bytes.
Apparently the library allocates longer d_name[] field than what
is shown to the user.
This is not a problem either because on MacOSX we get decomposed UTF8
and we always convert to composed UTF8. This means that the string
returned from reencode_string will always be smaller than the original
filename that had to be reencoded.
That's not true! There are strings which get longer when a composing
normalization is applied. Please see section 3.3 of Unicode Technical
Report 36:
http://www.unicode.org/reports/tr36/
> People assume that NFC always composes, and thus is the same or
> shorter length than the original source. However, some characters
> decompose in NFC.
(NFC = Normalization Form C, i.e. canonical composition.)
U+1D160 MUSICAL SYMBOL EIGHTH NOTE is given as an example with a 3x
expansion factor when encoded in UTF-8 (I don't know what it expands to;
seems odd to me.)
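
For anyone who wants to check the expansion, here is a minimal sketch
using Python's standard unicodedata module. It is not part of the patch
under discussion; the helper name utf8_len is made up for illustration.
It normalizes U+1D160 to NFC and compares UTF-8 byte lengths, with the
usual "e + combining acute" case alongside for contrast.

```python
import unicodedata

def utf8_len(s):
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))

# U+1D160 MUSICAL SYMBOL EIGHTH NOTE, the example from TR36 section 3.3.
note = "\U0001D160"
note_nfc = unicodedata.normalize("NFC", note)

# The usual case: decomposed "e" + COMBINING ACUTE ACCENT shrinks under NFC.
e_acute = "e\u0301"
e_acute_nfc = unicodedata.normalize("NFC", e_acute)

for label, before, after in (("eighth note", note, note_nfc),
                             ("e + acute", e_acute, e_acute_nfc)):
    code_points = " ".join(f"U+{ord(c):04X}" for c in after)
    print(f"{label}: {utf8_len(before)} -> {utf8_len(after)} bytes, "
          f"NFC = {code_points}")
```

The precomposed musical symbols are excluded from composition, so NFC
should leave this one in decomposed form. A buffer sized for the original
name is therefore not guaranteed to hold the normalized one.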
-hpa