Re: [PATCH V2] hfsplus: fixes worst-case unicode to char conversion of file names and attributes

On Tue, Apr 8, 2014 12:56 PM BST Sergei Antonov wrote:

>On 8 April 2014 04:19, Hin-Tak Leung <hintak.leung@xxxxxxxxx> wrote:
>
>> Strictly speaking, the maximum value of NLS_MAX_CHARSET_SIZE = 6
>> is not attainable in the case of conversion to UTF-8, as that
>> requires the use of surrogate pairs, i.e. consuming two storage units.
>
>True that 6 is not attainable, wrong that it is with surrogate pairs.
>A surrogate pair encodes code-points from U+10000 to U+10FFFF, which
>is 4 bytes in UTF-8 (a moderate 2 per one UTF-16 code-unit).
>
>Multiplier 3 is enough for all cases of UTF-16/UCS-2 to UTF-8 conversion.

No. The part of the commit message you skipped specifically mentioned that
conversion to a GB18030 locale can require x4, so x3 is not enough. The sentence
you quoted is just a "BTW, the value of 6 can not 'usually' happen within this...".
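
To illustrate the x3 vs. x4 point, here is a rough user-space sketch in Python. It is not the kernel code and does not use the kernel NLS tables; the helper names and the sample characters are made up for illustration. It counts output bytes per UTF-16 code unit for a handful of characters, including U+0080, which GB18030 encodes as a four-byte sequence.

```python
# Illustrative sketch only: output bytes needed per UTF-16 code unit.
# Helper names and the sample set are hypothetical, not taken from hfsplus.

def utf16_units(ch: str) -> int:
    """Number of 16-bit code units ch occupies in UTF-16."""
    return len(ch.encode("utf-16-be")) // 2

def bytes_per_unit(ch: str, encoding: str) -> float:
    """Encoded length of ch in `encoding`, divided by its UTF-16 code units."""
    return len(ch.encode(encoding)) / utf16_units(ch)

# ASCII, a Latin-1 control, a CJK ideograph, and a non-BMP code point
# (the last one needs a surrogate pair, i.e. two UTF-16 code units).
SAMPLES = ["A", "\u0080", "\u4e2d", "\U0001f600"]

def worst_case(encoding: str) -> float:
    """Largest bytes-per-code-unit ratio seen over SAMPLES."""
    return max(bytes_per_unit(ch, encoding) for ch in SAMPLES)

if __name__ == "__main__":
    for enc in ("utf-8", "gb18030"):
        print(enc, worst_case(enc))
```

Over this small sample, the surrogate-pair character costs only 2 bytes per code unit in both encodings, while the worst ratio comes from single code units: 3 for UTF-8 and 4 for GB18030. A sample cannot prove an upper bound, of course; it only shows that a multiplier of 3 is already too small for GB18030.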

I only put in UTF-8 there, because it is widely used. It is entirely possible for
a UTF-16BE -> "some-hill-billy-inbreeding-language-that-only-a-few-people-in-the-world-speak"
conversion scheme to hit 6.

Sigh. Remember what I said earlier about arguing about words and the meaning
of words not being constructive? The code does the correct thing. I could re-word
the commit message, delete that paragraph, or change it to read "... the use of
surrogate pairs *and other means of encoding the higher planes*, i.e. consuming
*more than* two storage units...", and you could probably then go on about what
"other means of encoding" means.

You can probably also go on about why it should be "two" or
"more than two", but not "more than or equal to two". Those few words
in the commit message really do not make any real difference,
and it is also quite unconstructive to argue about the meaning
of a few words, out of context.

Hin-Tak



