On 8 April 2014 16:08, Hin-Tak Leung <htl10@xxxxxxxxxxxxxxxxxxxxx> wrote:
> ------------------------------
> On Tue, Apr 8, 2014 12:56 PM BST Sergei Antonov wrote:
>
>>On 8 April 2014 04:19, Hin-Tak Leung <hintak.leung@xxxxxxxxx> wrote:
>>
>>> Strictly speaking, the maximum value of NLS_MAX_CHARSET_SIZE = 6
>>> is not attainable in the case of conversion to UTF-8, as that
>>> requires the use of surrogate pairs, i.e. consuming two storage units.
>>
>>True that 6 is not attainable, wrong that it is with surrogate pairs.
>>A surrogate pair encodes code-points from U+10000 to U+10FFFF, which
>>is 4 bytes in UTF-8 (a moderate 2 per one UTF-16 code-unit).
>>
>>Multiplier 3 is enough for all cases of UTF-16/UCS-2 to UTF-8 conversion.
>
> No. The part of the commit message you skipped, specifically mentioned that
> conversion to a GB18030 locale can require x4. x3 is not enough. The sentence
> is just a "BTW, the value of 6 can not 'usually' happen within this...".

My statement explicitly concerns UTF-8. GB18030 is not UTF-8.

> I only put in UTF-8 there, because it is widely used. It is entirely possible for
> a UTF16-BE -> "some-hill-billy-inbreeding-language-that-only-a-few-people-in-the-world-speak"
> conversion scheme to hit 6.
>
> Sigh. Remember what I said earlier about arguing about words and meaning
> of words not being constructive? The code does the correct thing. I could re-word
> the commit message, and/or delete that paragraph, or modifying "... the use of
> surrogate pairs *and other means of encoding the higher planes*, i.e. consuming
> *more than* two storage units...". and you could probably go on about what
> "other means of encoding" is.
>
> You can probably also go on about why it should be "two",
> "more than two", but not "more than or equal to two". That few words
> in the commit message really does not make any real difference,
> and it is also quite unconstructive to argue about the meaning
> of a few words, out of context.

Didn't I understand the quoted text right? I understood it as this:
attaining 6 requires the use of surrogate pairs. That's what I refuted.
No offence. It is just a correction. And yes, IMO errors in patch
description count.
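
For anyone who wants to check the UTF-8 arithmetic above, here is a small
userspace sketch. It is not kernel code and not part of the patch; the
function names are made up for illustration. It counts UTF-8 bytes and
UTF-16 code units per code point and reports the largest bytes-per-unit
ratio over all of Unicode. It says nothing about GB18030 or other charsets.

```python
def utf16_units(cp: int) -> int:
    """UTF-16 code units for one code point (2 means a surrogate pair)."""
    return 2 if cp >= 0x10000 else 1


def utf8_bytes(cp: int) -> int:
    """UTF-8 bytes for one code point, per the standard length ranges."""
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


def worst_ratio() -> float:
    """Largest UTF-8-bytes per UTF-16-unit ratio over all code points."""
    worst = 0.0
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points are not characters on their own
        worst = max(worst, utf8_bytes(cp) / utf16_units(cp))
    return worst


if __name__ == "__main__":
    # U+FFFF: one UTF-16 unit, three UTF-8 bytes.
    print(utf8_bytes(0xFFFF), utf16_units(0xFFFF))
    # U+10000: a surrogate pair (two units), four UTF-8 bytes.
    print(utf8_bytes(0x10000), utf16_units(0x10000))
    print(worst_ratio())
```

The worst case comes from the BMP range U+0800..U+FFFF (3 bytes for 1 unit);
a surrogate pair only reaches 4 bytes for 2 units.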