Re: [PATCH 1/5] git-compat-util: add isblank() and isgraph()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 11.02.23 um 14:48 schrieb René Scharfe:
> Am 11.02.23 um 08:01 schrieb Masahiro Yamada:
>> On Sat, Feb 11, 2023 at 7:03 AM René Scharfe <l.s.r@xxxxxx> wrote:
>>>
>>> We already use eight bits for the lookup table values in sane_ctype.
>>> Perhaps it's time to move GIT_GLOB_SPECIAL, GIT_REGEX_SPECIAL,
>>> GIT_PATHSPEC_MAGIC and GIT_PUNCT to their own table to make room for
>>> flags for isprint(), isgraph() and isblank().
>>
>>
>>
>> I think that is a good idea.
>>
>> If we can have more room for isupper() and islower(),
>> we will be able to delete sane_iscase().
>>
>>
>> (I was thinking 'unsigned short', but having two tables is OK.)
>>
>>
>>
>> So, how can I proceed with this patchset?
>>
>>
>>
>> [A] After somebody refactors ctype.c table,
>>     I will rebase this series on top of that.
>>
>> [B] keep isblank() and isgraph() local to wildmatch.c
>>     until that happens, so I can proceed without
>>     depending on the ctype.c refactoring.
>>
>>     Apparently, wildmatch.c is not using a pointer
>>     increment with isblank() or isgraph().
>>
>> [C] If 'somebody' in [A] is supposed to me,
>>     my v2 will include ctype refactoring.
>
> [D] We need more tests first. :)  I sent patches to test the current
> classifiers separately.  A similar test for isblank would have helped
> you detect the mismatch quickly.  Full test coverage also gives
> confidence when tinkering with the existing classifiers.
>
> 1c149ab2dd (ctype: support iscntrl, ispunct, isxdigit and isprint,
> 2012-10-15) added an implementation of isprint that evaluated its
> argument only once, by the way, but the one from 0fcec2ce54
> (format-patch: make rfc2047 encoding more strict, 2012-10-18)
> replaced it.
>
> Widening the element type of the lookup table would work, but might
> impact performance.  I guess it would be slower, but perhaps we'd
> have to measure it to be sure.  Splitting the table into unrelated
> subsets would avoid that.
>
> Deciding which flags to move requires knowing the full target set,
> I think.  The punctuation-related flags looked to me like good
> candidates until I saw the original isprint implementation which
> uses them and several of the others.  So I'm not so sure anymore,
> but here's a patch that moves them out:

Perhaps:

[E] Implement isblank and isgraph as inline functions and leave the
lookup table integration for later:

   #undef isblank
   #define isblank(x) sane_isblank(x)
   static inline int sane_isblank(int c) {return c == ' ' || c == '\t';}

   #undef isgraph
   #define isgraph(x) sane_isgraph(x)
   static inline int sane_isgraph(int c) {return isprint(c) && c != ' ';}

The Lazy Way™ ;)

René




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux