René Scharfe <l.s.r@xxxxxx> writes: >> In the previous submission, I just moved isblank() and isgraph() as >> implemented in wildmatch.c. I knew they were not robust against the >> pointer increment like isblank(*s++), but I thought it was the same >> pattern as isprint(), which has the same issue. Unfortunately, it was >> more controversial than I had expected... > > Not sure we need that story in the commit message,... Usually we encourage people to write as if there were no "previous iterations". Describe how to reach the best end-result without any detour, using the experience gained from failed attempts. > ... but it gave me an idea: ;-) > ... So it's not so > easy, however, ... > >> This version implements them as inline functions because we ran out >> all bits in the sane_ctype[] table. This is the same pattern as >> islower() and isupper(). > > ... if you remove GIT_SPACE from the definition above you get a > macro version of isgraph() that uses a single table lookup. > > If we're out of bits then isblank() is a good choice to implement > without a table lookup, as this class only contains two characters > and two comparisons should be quite fast. Implementing sane_isblank() plus sane_isgraph() as static inlines is already not too bad, but if one of them can be made into a simple table look-up, that indeed is better ;-) > Stepping back a bit: Is using the unlocalized is* macros in > wildmatch() safe, i.e. do we get the same results as before > regardless of locale? Junio's remark in > https://lore.kernel.org/git/xmqq3579crsd.fsf@gitster.g/ sounds > convincing to me if we don't care about single-byte code pages > and require plain ASCII or UTF-8. I think it's a good idea to > address that point in the commit message. True. To be honest, I wasn't thinking about non-UTF-8 single-byte "code pages". In the context of wildmatch, the strings we deal with mostly interact with the paths on the filesystem. If you interact with your filesystem in latin-1 that may pose an issue, and even if (or rather, especially if) we are to declare that it does not matter, we should document it in the proposed log message. > > And adding tests to t/helper/test-ctype.c would be nice. Very true. >> +#define isblank(x) sane_isblank(x) >> +#define isgraph(x) sane_isgraph(x) >> @@ -1270,6 +1274,16 @@ static inline int sane_iscase(int x, int is_lower) >> return (x & 0x20) == 0; >> } >> >> +static inline int sane_isblank(int c) >> +{ >> + return c == ' ' || c == '\t'; >> +} >> + >> +static inline int sane_isgraph(int c) >> +{ >> + return isprint(c) && !isspace(c); >> +}