Re: [PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

lars.schneider@xxxxxxxxxxxx writes:

> +int is_missing_required_utf_bom(const char *enc, const char *data, size_t len)
> +{
> +	return (
> +	   !strcmp(enc, "UTF-16") &&
> +	   !(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
> +	     has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
> +	) || (
> +	   !strcmp(enc, "UTF-32") &&
> +	   !(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
> +	     has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
> +	);
> +}

These strcmp() calls seem inconsistent with the principle embodied
by utf8.c::fallback_encoding(), i.e. "be lenient in what we accept",
and make the interface uneven.  I am wondering if we also want to
complain when the user gave us "utf16" and there is no byte order
mark in the contents, for example?  Also, "UTF16" or any other
spelling that the platform may support but this code fails to
recognise will go unchecked.

Then again, being able to bypass this check with an alternative
spelling may actually be a feature, not a bug; I dunno.

The same comment applies to the previous step.
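
To illustrate what a more lenient check might look like, here is a
rough sketch, not a proposed patch.  is_utf_name() is a hypothetical
helper invented for this example; the BOM tables and has_bom_prefix()
are stand-ins for the ones in utf8.c, repeated only so that the
snippet compiles on its own.  It assumes POSIX strncasecmp() is
available and only covers the spellings with and without a dash, in
any case.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() is POSIX, not ISO C */

/* Stand-ins for the tables in utf8.c, so the sketch is self-contained. */
static const char utf16_be_bom[] = { '\xFE', '\xFF' };
static const char utf16_le_bom[] = { '\xFF', '\xFE' };
static const char utf32_be_bom[] = { '\0', '\0', '\xFE', '\xFF' };
static const char utf32_le_bom[] = { '\xFF', '\xFE', '\0', '\0' };

static int has_bom_prefix(const char *data, size_t len,
			  const char *bom, size_t bom_len)
{
	return data && bom && (len >= bom_len) && !memcmp(data, bom, bom_len);
}

/*
 * Hypothetical helper: true if "enc" spells "UTF-<bits>" in any case,
 * with or without the dash ("UTF-16", "utf-16", "UTF16", "utf16").
 * "UTF-16LE" and friends do not match, as they name the byte order.
 */
static int is_utf_name(const char *enc, const char *bits)
{
	if (!enc || strncasecmp(enc, "utf", 3))
		return 0;
	enc += 3;
	if (*enc == '-')
		enc++;
	return !strcmp(enc, bits);
}

int is_missing_required_utf_bom(const char *enc, const char *data, size_t len)
{
	return (
	   is_utf_name(enc, "16") &&
	   !(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
	     has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
	) || (
	   is_utf_name(enc, "32") &&
	   !(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
	     has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
	);
}
```

With something along these lines, "utf16" without a BOM would be
flagged just like "UTF-16", at the cost of closing the bypass
mentioned above.  Other spellings that the platform's iconv may
accept would still go unchecked.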



