Re: [PATCH v9 6/8] convert: check for detectable errors in UTF encodings

Lars Schneider <larsxschneider@xxxxxxxxx> writes:

>> On 05 Mar 2018, at 22:50, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>> 
>> lars.schneider@xxxxxxxxxxxx writes:
>> 
>>> +static int validate_encoding(const char *path, const char *enc,
>>> +		      const char *data, size_t len, int die_on_error)
>>> +{
>>> +	if (!memcmp("UTF-", enc, 4)) {
>> 
>> Does the caller already know that enc is sufficiently long that
>> using memcmp is safe?
>
> No :-(
>
> Would you be willing to squash that in?
>
>     if (strlen(enc) > 4 && !memcmp("UTF-", enc, 4)) {
>
> I deliberately used "> 4" as plain "UTF-" is not even valid.

I'd rather not.  The code does not have to look at the 6th and
later bytes of enc[] even if it wanted to reject "UTF-" followed
by nothing, but use of strlen() forces it to scan the whole string.

Stepping back, shouldn't

	if (starts_with(enc, "UTF-"))

be sufficient?  If you really care about the case where "UTF-" alone
comes here, you could write

	if (starts_with(enc, "UTF-") && enc[4])

but I do not think "&& enc[4]" is even needed.  The functions called
from this block would not consider "UTF-" alone as something valid
anyway, so with that "&& enc[4]" we would be piling on more code only
for an invalid/rare case.
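
To illustrate why a prefix test is safe where the 4-byte memcmp() is
not, here is a minimal sketch.  The names starts_with_sketch() and
is_utf_name_sketch() are made up for this example, and the body is an
assumption about how such a helper can be written, not a copy of the
real starts_with().  The point is that the loop stops at the first
mismatch, so it never reads past the NUL of a short enc.

```c
/*
 * Sketch only: a prefix test in the spirit of starts_with().
 * Returns 1 if "str" begins with "prefix", 0 otherwise.  When "str"
 * is shorter than "prefix", its terminating NUL mismatches the next
 * prefix byte and the loop returns before reading any further.
 */
static int starts_with_sketch(const char *str, const char *prefix)
{
	for (; *prefix; str++, prefix++)
		if (*str != *prefix)
			return 0;
	return 1;
}

/*
 * Hypothetical helper showing the optional stricter form: it also
 * rejects a bare "UTF-" by checking only the 5th byte, without
 * scanning the rest of the string the way strlen() would.
 */
static int is_utf_name_sketch(const char *enc)
{
	return starts_with_sketch(enc, "UTF-") && enc[4];
}
```

With these, "UTF-8" and "UTF-" both pass the plain prefix test, "UT"
and "" fail it without an out-of-bounds read, and only the stricter
form rejects "UTF-" by itself.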


