> On 06 Mar 2018, at 21:50, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> lars.schneider@xxxxxxxxxxxx writes:
>
>> +int is_missing_required_utf_bom(const char *enc, const char *data, size_t len)
>> +{
>> +    return (
>> +       !strcmp(enc, "UTF-16") &&
>> +       !(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
>> +         has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
>> +    ) || (
>> +       !strcmp(enc, "UTF-32") &&
>> +       !(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
>> +         has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
>> +    );
>> +}
>
> These strcmp() calls seem inconsistent with the principle embodied
> by utf8.c::fallback_encoding(), i.e. "be lenient to what we accept",
> and make the interface uneven. I am wondering if we also want to
> complain when the user gave us "utf16" and there is no byte order
> mark in the contents, for example?

Well, if I use stricmp() then I don't need to call and cleanup
xstrdup_toupper() as discussed with Eric [1]. Is there a case insensitive
starts_with() method?

[1] https://public-inbox.org/git/CAPig+cQE0pKs-AMvh4GndyCXBMnx=70jPpDM6K4jJTe-74FecQ@xxxxxxxxxxxxxx/

> Also "UTF16" or other spelling
> the platform may support but this code fails to recognise will go
> unchecked.

That is true. However, I would assume all iconv implementations use the
same encoding names for UTF encodings, no? That means UTF16 would never be
valid. Would you agree?

- Lars
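
To make the question concrete, here is a rough sketch of what a case-insensitive variant of the quoted check could look like. It is only an illustration, not the code from the patch series. It assumes POSIX strcasecmp() from <strings.h> in place of stricmp(), it guesses at the BOM tables and at the body of has_bom_prefix(), which the quoted hunk uses but does not show, and istarts_with_sketch() is a hypothetical helper name made up for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strcasecmp(), POSIX; assumed stand-in for stricmp() */

/* Assumed BOM byte sequences; the quoted hunk references but does not show them. */
static const char utf16_be_bom[] = {'\xFE', '\xFF'};
static const char utf16_le_bom[] = {'\xFF', '\xFE'};
static const char utf32_be_bom[] = {'\0', '\0', '\xFE', '\xFF'};
static const char utf32_le_bom[] = {'\xFF', '\xFE', '\0', '\0'};

/* Assumed helper: true if data begins with the given BOM bytes. */
int has_bom_prefix(const char *data, size_t len,
		   const char *bom, size_t bom_len)
{
	return data && bom && len >= bom_len && !memcmp(data, bom, bom_len);
}

/* Hypothetical case-insensitive starts_with(); the name is invented here. */
int istarts_with_sketch(const char *str, const char *prefix)
{
	for (; *prefix; str++, prefix++)
		if (tolower((unsigned char)*str) !=
		    tolower((unsigned char)*prefix))
			return 0;
	return 1;
}

/* Same logic as the quoted hunk, with strcmp() swapped for strcasecmp(). */
int is_missing_required_utf_bom(const char *enc, const char *data, size_t len)
{
	return (
	   !strcasecmp(enc, "UTF-16") &&
	   !(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
	     has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
	) || (
	   !strcasecmp(enc, "UTF-32") &&
	   !(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
	     has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
	);
}
```

With this variant "utf-16" and "UTF-16" are treated alike, but a spelling without the dash such as "utf16" still falls through unchecked, so it addresses only the case-sensitivity half of the concern raised above.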