> On 06 Mar 2018, at 02:23, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Lars Schneider <larsxschneider@xxxxxxxxx> writes:
>
>>> On 05 Mar 2018, at 22:50, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>>
>>> lars.schneider@xxxxxxxxxxxx writes:
>>>
>>>> +static int validate_encoding(const char *path, const char *enc,
>>>> +		      const char *data, size_t len, int die_on_error)
>>>> +{
>>>> +	if (!memcmp("UTF-", enc, 4)) {
>>>
>>> Does the caller already know that enc is sufficiently long that
>>> using memcmp is safe?
>>
>> No :-(
>>
>> Would you be willing to squash that in?
>>
>>     if (strlen(enc) > 4 && !memcmp("UTF-", enc, 4)) {
>>
>> I deliberately used "> 4" as plain "UTF-" is not even valid.
>
> I'd rather not.  The code does not have to even look at 6th and
> later bytes in the enc[] even if it wanted to reject "UTF-" followed
> by nothing, but use of strlen() forces it to look at everything.
>
> Stepping back, shouldn't
>
> 	if (starts_with(enc, "UTF-"))
>
> be sufficient?  If you really care about the case where "UTF-" alone
> comes here, you could write
>
> 	if (starts_with(enc, "UTF-") && enc[4])
>
> but I do not think "&& enc[4]" is even needed.  The functions called
> from this block would not consider "UTF-" alone as something valid
> anyway, so with that "&& enc[4]" we would be piling more code only
> for invalid/rare case.

Agreed, "if (starts_with(enc, "UTF-"))" is sufficient. Can you squash
that in?

Thanks for pointing me to starts_with() as I forgot about this function!

- Lars
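
For reference, a minimal sketch of the two checks discussed above. It is
not Git's source: starts_with_sketch() is a hypothetical stand-in that
only mimics what starts_with() is expected to do, and the two
is_utf_encoding*() wrappers are made-up names for illustration. The
point is that the prefix walk stops at the first mismatch or at the NUL
of enc, so it never reads past the end of a short string the way
memcmp("UTF-", enc, 4) could, and it never scans the whole string the
way strlen() does.

```c
/*
 * Hypothetical stand-in for a starts_with()-style helper.
 * Returns 1 if str begins with prefix, 0 otherwise.
 * If str is shorter than prefix, its terminating NUL mismatches the
 * next prefix byte and the loop returns before reading any further.
 */
static int starts_with_sketch(const char *str, const char *prefix)
{
	for (; *prefix; str++, prefix++)
		if (*str != *prefix)
			return 0;
	return 1;
}

/* The check agreed on in the thread: the prefix alone is enough. */
static int is_utf_encoding(const char *enc)
{
	return starts_with_sketch(enc, "UTF-");
}

/*
 * The stricter variant mentioned in the thread, which also rejects a
 * bare "UTF-".  Reading enc[4] is safe here because the prefix match
 * guarantees that bytes 0..3 are non-NUL, so byte 4 is at worst the
 * terminator.
 */
static int is_utf_encoding_strict(const char *enc)
{
	return starts_with_sketch(enc, "UTF-") && enc[4] != '\0';
}
```

With this sketch, is_utf_encoding("UTF-8") and is_utf_encoding("UTF-")
are both true while is_utf_encoding("") is false; only the strict
variant rejects the bare "UTF-". The comparison is case-sensitive, so
"utf-8" would not match in either variant.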