Re: [RFC PATCH 6/6] utf8.c: avoid char overflow

Beat Bolli <dev+git@xxxxxxxxx> writes:

>>> -static const char utf16_be_bom[] = {0xFE, 0xFF};
>>> -static const char utf16_le_bom[] = {0xFF, 0xFE};
>>> -static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
>>> -static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
>>> +static const unsigned char utf16_be_bom[] = {0xFE, 0xFF};
>>> +static const unsigned char utf16_le_bom[] = {0xFF, 0xFE};
>>> +static const unsigned char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
>>> +static const unsigned char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
>>
>> An alternative approach that might be easier to read (and avoids the
>> confusion arising from our use of (signed) chars for strings pretty
>> much everywhere):
>>
>> #define FE ((char)0xfe)
>> #define FF ((char)0xff)
>>
>> ...
>
> I have tried this first (without the macros, though), and thought
> it looked really ugly. That's why I chose this solution. The usage
> is pretty local and close to function has_bom_prefix().

I found that what you posted was already OK, as has_bom_prefix()
appears only locally in this file and is the only thing that cares
about these foo_bom[] constants.  Casting the elements in these
arrays to (char) type is also fine and not all that ugly, I think,
and between the two (but without the macro) I have no strong
preference.  I wonder if writing them as '\376' and '\377', as old
timers would, helps the compiler, though.
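
For reference, the three spellings discussed above can be put side by
side.  This is only a rough sketch, not code from the patch: the array
names and the starts_with_bom() helper are made up for illustration
(the latter is a stand-in for has_bom_prefix(), whose exact signature
is not shown in this thread), and it assumes the usual 8-bit,
two's-complement char.

```c
#include <stddef.h>
#include <string.h>

/* Variant 1, as in the posted patch: store the bytes as unsigned char. */
static const unsigned char bom_unsigned[] = {0xFE, 0xFF};

/* Variant 2: keep plain char and cast each element explicitly. */
static const char bom_cast[] = {(char)0xFE, (char)0xFF};

/* Variant 3: octal character escapes, which need no cast. */
static const char bom_octal[] = {'\376', '\377'};

/*
 * Hypothetical stand-in for has_bom_prefix().  memcmp() compares the
 * bytes as unsigned char, so the signedness of the array element type
 * does not affect the result.
 */
static int starts_with_bom(const char *data, size_t len,
			   const void *bom, size_t bom_len)
{
	return data && len >= bom_len && !memcmp(data, bom, bom_len);
}
```

All three arrays should hold the same two bytes on such platforms, so
a memcmp()-based check behaves the same whichever spelling is chosen;
the difference is only in how the initializers read and whether the
compiler has reason to warn about them.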



