Phillip Wood <phillip.wood123@xxxxxxxxx> 于2021年5月29日周六 下午9:23写道: > > On 27/05/2021 17:36, Felipe Contreras wrote: > > ZheNing Hu via GitGitGadget wrote: > > [...] > >> +static int memcasecmp(const void *vs1, const void *vs2, size_t n) > > > > Why void *? We can delcare as char *. > > If you look at how this function is used you'll see > int (*cmp_fn)(const void *, const void *, size_t); > cmp_fn = s->sort_flags & REF_SORTING_ICASE > ? memcasecmp : memcmp; > > So the signature must match memcmp to avoid undefined behavior (a > ternary expression is undefined unless both sides evaluate to the same > type and calling a function through a pointer a different type is > undefined as well) > I agree. > >> +{ > >> + size_t i; > >> + const char *s1 = (const char *)vs1; > >> + const char *s2 = (const char *)vs2; > > > > Then we avoid this extra step. > > > >> + for (i = 0; i < n; i++) { > >> + unsigned char u1 = s1[i]; > >> + unsigned char u2 = s2[i]; > > > > There's no need for two entirely new variables... > > > >> + int U1 = toupper (u1); > >> + int U2 = toupper (u2); > > > > You can do toupper(s1[i]) directly (BTW, there's an extra space: `foo(x)`, > > not `foo (x)`). > > > > While we are at it, why keep an extra index from s1, when s1 is never > > used again? > > > > We can simply advance both s1 and s2: > > > > s1++, s2++ > > > >> + int diff = (UCHAR_MAX <= INT_MAX ? U1 - U2 > >> + : U1 < U2 ? -1 : U2 < U1); > > > > I don't understand what this is supposed to achieve. Both U1 and U2 are > > integers, pretty low integers actually. > > > > If we get rid if that complexity we don't even need U1 or U2, just do: > > > > diff = toupper(u1) - toupper(u2); > > > >> + if (diff) > >> + return diff; > >> + } > >> + return 0; > >> +} > > > > All we have to do is define the end point, and then we don't need i: > > > > static int memcasecmp(const char *s1, const char *s2, size_t n) > > { > > const char *end = s1 + n; > > for (; s1 < end; s1++, s2++) { > > int diff = tolower(*s1) - tolower(*s2); > > if (diff) > > return diff; > > } > > return 0; > > } > > > > (and I personally prefer lower to upper) > > We should be using tolower() as that is what POSIX specifies for > strcasecmp() [1] which we are trying to emulate and there are cases[2] where > (tolower(c1) == tolower(c2)) != (toupper(c1) == toupper(c2)) > I don’t know if we overlooked a fact: This static `memcasecmp()` is not a POSIX version. `tolower()` or `toupper()` are in git-compat-util.h, sane_istest('\0', GIT_ALPHA) == false . So in `sane_case()`, whatever `tolower()`, `toupper()`, they just return '\0' itself. > Best Wishes > > Phillip > > [1] https://pubs.opengroup.org/onlinepubs/9699919799/ > [2] https://en.wikipedia.org/wiki/Dotted_and_dotless_I#In_computing > Thanks. -- ZhenNing Hu