Lars Schneider <larsxschneider@xxxxxxxxx> writes:

>> On 05 Mar 2018, at 22:50, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>
>> lars.schneider@xxxxxxxxxxxx writes:
>>
>>> +static int validate_encoding(const char *path, const char *enc,
>>> +		      const char *data, size_t len, int die_on_error)
>>> +{
>>> +	if (!memcmp("UTF-", enc, 4)) {
>>
>> Does the caller already know that enc is sufficiently long that
>> using memcmp is safe?
>
> No :-(
>
> Would you be willing to squash that in?
>
>     if (strlen(enc) > 4 && !memcmp("UTF-", enc, 4)) {
>
> I deliberately used "> 4" as plain "UTF-" is not even valid.

I'd rather not.  The code does not even have to look at the 6th and
later bytes in enc[], even if it wanted to reject "UTF-" followed by
nothing, but use of strlen() forces it to look at everything.

Stepping back, shouldn't

	if (starts_with(enc, "UTF-"))

be sufficient?  If you really care about the case where "UTF-" alone
comes here, you could write

	if (starts_with(enc, "UTF-") && enc[4])

but I do not think "&& enc[4]" is even needed.  The functions called
from this block would not consider "UTF-" alone as something valid
anyway, so with that "&& enc[4]" we would be piling on more code only
for an invalid/rare case.
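To illustrate the point about how many bytes each check has to
inspect, here is a minimal standalone sketch.  The starts_with() below
is a stand-in written for this example, assumed to behave like the
helper of the same name used in the snippets above, and
looks_like_utf() is a hypothetical name that exists only here.

```c
#include <stddef.h>

/*
 * Illustrative stand-in for a starts_with() helper (an assumption,
 * not copied from any code base).  Returns 1 if str begins with
 * prefix, 0 otherwise.  The loop stops at the first mismatch, which
 * includes hitting the NUL that terminates a too-short str, so it
 * never reads past the end of str and never scans beyond the length
 * of prefix.
 */
static int starts_with(const char *str, const char *prefix)
{
	for (; *prefix; str++, prefix++)
		if (*str != *prefix)
			return 0;
	return 1;
}

/*
 * Hypothetical helper: accept "UTF-" followed by at least one more
 * byte.  Reading enc[4] is safe here because starts_with() has
 * already established that enc[0..3] are non-NUL, so enc[4] is at
 * worst the terminating NUL.
 */
static int looks_like_utf(const char *enc)
{
	return starts_with(enc, "UTF-") && enc[4];
}
```

With these definitions, a short string such as "UT" is rejected
without touching anything beyond its terminating NUL, and a long
encoding name is decided after at most five bytes, whereas a
strlen()-based check would walk the whole string first.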