lars.schneider@xxxxxxxxxxxx writes:

> +int is_missing_required_utf_bom(const char *enc, const char *data, size_t len)
> +{
> +	return (
> +	   !strcmp(enc, "UTF-16") &&
> +	   !(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
> +	     has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
> +	) || (
> +	   !strcmp(enc, "UTF-32") &&
> +	   !(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
> +	     has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
> +	);
> +}

These strcmp() calls seem inconsistent with the principle embodied
by utf8.c::fallback_encoding(), i.e. "be lenient in what we accept",
and they make the interface uneven.

I am wondering if we also want to complain when the user gave us
"utf16" and there is no byte order mark in the contents, for
example?  Also "UTF16" or other spellings the platform may support
but this code fails to recognise will go unchecked.  Which actually
may be a feature, not a bug, to be able to bypass this check---I
dunno.

The same comment applies to the previous step.
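
For illustration only, a lenient match could look something along
these lines (a hypothetical same_utf_encoding() helper, not code
from the patch), accepting the common spelling variants before the
BOM check kicks in:

	#include <strings.h>	/* strcasecmp(), strncasecmp() */

	/*
	 * Hypothetical helper: treat "UTF-16", "utf-16", "UTF16",
	 * "utf16", etc. as the same encoding name by skipping an
	 * optional dash after the "utf" prefix and comparing the
	 * rest case-insensitively.
	 */
	static int same_utf_encoding(const char *want, const char *got)
	{
		if (strncasecmp(want, "utf", 3) || strncasecmp(got, "utf", 3))
			return 0;
		want += 3;
		got += 3;
		if (*want == '-')
			want++;
		if (*got == '-')
			got++;
		return !strcasecmp(want, got);
	}

The callers would then say same_utf_encoding("UTF-16", enc) instead
of !strcmp(enc, "UTF-16"), but as said above, being strict here may
be a deliberate escape hatch, so this is only to show the shape of
the alternative.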