On Thu, Mar 15, 2018 at 7:35 PM, Lars Schneider <larsxschneider@xxxxxxxxx> wrote: >> On 16 Mar 2018, at 00:25, Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote: >>> if (is_encoding_utf8(src) && is_encoding_utf8(dst)) >>> return 1; >>> + if (same_utf_encoding(src, dst)) >>> + return 1; >>> return !strcasecmp(src, dst); >>> } >> >> This seems odd. I would have expected the newly-added generalized >> conditional to replace the original UTF-8-specific conditional, not >> supplement it. That is, shouldn't the entire function body be: >> >> if (same_utf_encoding(src, dst)) >> return 1; >> return !strcasecmp(src, dst); > > No, because is_encoding_utf8() returns "true" (=1) if the encoding > is NULL. That is not the case for UTF-16 et al. The caller of > same_encoding() might expect that behavior. > I could have moved the "UTF-8" == NULL assumption into > same_utf_encoding() but that did not feel right. > Does this make sense? Not particularly. Looking at 677cfed56a (commit-tree: cope with different ways "utf-8" can be spelled., 2006-12-30), which introduced is_encoding_utf8() along with its builtin NULL-check, I can see why that function wants to treat NULL the same as UTF-8. However, I'm having a tough time imagining cases in which callers would want same_encoding() to return true if both arguments are NULL, but outright crash if only one is NULL (which is the behavior even before this patch). In other words, same_encoding() takes advantage of is_encoding_utf8() for its convenience, not for its NULL-handling. Given that view, the two explicit is_encoding_utf8() calls in same_encoding() seem redundant once the same_utf_encoding() call is added.