On 09.07.18 18:33, Junio C Hamano wrote:
> Beat Bolli <dev+git@xxxxxxxxx> writes:
>
>>>> -static const char utf16_be_bom[] = {0xFE, 0xFF};
>>>> -static const char utf16_le_bom[] = {0xFF, 0xFE};
>>>> -static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
>>>> -static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
>>>> +static const unsigned char utf16_be_bom[] = {0xFE, 0xFF};
>>>> +static const unsigned char utf16_le_bom[] = {0xFF, 0xFE};
>>>> +static const unsigned char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
>>>> +static const unsigned char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
>>>
>>> An alternative approach that might be easier to read (and avoids the
>>> confusion arising from our use of (signed) chars for strings pretty
>>> much everywhere):
>>>
>>> #define FE ((char)0xfe)
>>> #define FF ((char)0xff)
>>>
>>> ...
>>
>> I have tried this first (without the macros, though), and thought
>> it looked really ugly. That's why I chose this solution. The usage
>> is pretty local and close to function has_bom_prefix().
>
> I found that what you posted was already OK, as has_bom_prefix()
> appears only locally in this file and that is the only thing that
> cares about these foo_bom[] constants. Casting the elements in
> these arrays to (char) type is also fine and not all that ugly,
> I think, and between the two (but without the macro) I have no
> strong preference. I wonder if writing them as '\376' and '\377'
> as old timers would helps the compiler, though.
>

Yes, it does, as I found out in
https://public-inbox.org/git/e3df2644b59b170e26b2a7c0d3978331@xxxxxxxxx/

But I prefer hex; it's closer to the usual definition of the BOM bytes.

Beat
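
For reference, a minimal sketch of the three spellings discussed in this thread (unsigned char with hex, plain char with casts, plain char with octal escapes), using the UTF-16 BE BOM as the example. The names bom_hex, bom_cast, bom_octal, has_prefix() and check_bom_spellings() are hypothetical and only for illustration; this is not the code in the patch, and has_prefix() is not git's has_bom_prefix(). It assumes the usual platforms where converting 0xFE to a signed char keeps the bit pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Spelling 1 (as in the patch): unsigned char holds 0xFE and 0xFF directly. */
static const unsigned char bom_hex[] = {0xFE, 0xFF};

/* Spelling 2: keep plain char and cast each element. */
static const char bom_cast[] = {(char)0xFE, (char)0xFF};

/* Spelling 3: octal character escapes in plain char. */
static const char bom_octal[] = {'\376', '\377'};

/*
 * Hypothetical helper for illustration only. It takes void pointers so
 * that arrays of either element type can be passed without a cast.
 */
static int has_prefix(const void *data, size_t len,
		      const void *bom, size_t bom_len)
{
	return data && bom && len >= bom_len && !memcmp(data, bom, bom_len);
}

/* memcmp() compares bytes as unsigned char, so all three spellings match. */
static void check_bom_spellings(void)
{
	assert(sizeof(bom_hex) == 2);
	assert(!memcmp(bom_hex, bom_cast, sizeof(bom_hex)));
	assert(!memcmp(bom_hex, bom_octal, sizeof(bom_hex)));
}
```

Whichever spelling is used, the comparison itself is unaffected, since memcmp() only looks at the underlying bytes; the difference is in how the initializers read and whether the compiler complains about a value that does not fit in a signed char.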