Re: [PATCH] use strpbrk(3) to search for characters from a given set

Jeff King <peff@xxxxxxxx> · Mon, 24 Feb 2020 01:38:02 -0500

On Sat, Feb 22, 2020 at 07:51:19PM +0100, René Scharfe wrote:

> We can check if certain characters are present in a string by calling
> strchr(3) on each of them, or we can pass them all to a single
> strpbrk(3) call.  The latter is shorter, less repetitive and slightly
> more efficient, so let's do that instead.

I think the resulting code is more readable, and this is worth doing on
those grounds alone.

I was curious whether it's always more efficient (tldr, it is; read on
only if you're also curious). The default glibc implementation of
strpbrk() is built on strcspn(). That in turn creates a 256-byte table
with a boolean flag set for each char in the accept string, and then
walks the haystack string doing O(1) lookups in the table for a hit. So
we have to memset() the table to zeroes at the start. Depending on the
length of the input string, that could be more expensive than just
walking the haystack string three times (one for each char in the accept
string).

But of course in practice there's all kinds of weird SSE assembly going
on behind the scenes. So in most of my tests, strpbrk() beats strchr()
handily. The one exception is when the first strchr() matches early in
the string and can short-circuit the rest of them.

The test I used was:

  $ cat foo.c
  #include <string.h>

  int check(const char *s)
  {
  #if USE_STRCHR
  	return strchr(s, '<') || strchr(s, '>') || strchr(s, '@');
  #else
  	return !!strpbrk(s, "@<>");
  #endif
  }

  $ cat bar.c 
  int check(const char *s);

  int main(int argc, const char **argv)
  {
  	int ret;
  	int i;

  	for (i = 0; i < 100000000; i++)
  		ret += check(argv[1]);
  	return ret & 0x7f;
  }

I put it into two separate files to avoid check() being hoisted out of
the loop. There's something a bit interesting, though. If you combine
those two files, gcc _does_ hoist the strpbrk() version, generating
this as the complete code:

	  subq    $8, %rsp
          .cfi_def_cfa_offset 16
          movq    8(%rsi), %rdi
          leaq    .LC0(%rip), %rsi
          call    strpbrk@PLT
          testq   %rax, %rax
          setne   %al
          addq    $8, %rsp
          .cfi_def_cfa_offset 8
          movzbl  %al, %eax
          movl    %eax, %edx
          imull   $99999999, %eax, %eax
          addl    %edx, %eax
          andl    $127, %eax
          ret

So it calls strpbrk() exactly once, then recognizes that the loop is
just adding up the same variable over and over and turns it into a
multiply. No loop at all; very cool (though I am puzzled why it doesn't
multiply by the full loop count; instead if multiplies by one less and
then adds back in one iteration).

But with strchr() it was less willing to do that (curiously, it will if
the function _isn't_ static, but won't if it is).

Anyway, that's all compiler arcana subject to change between versions,
and it's a dumb toy example anyway. But it does seem like the single
call is more likely to get inlined or hoisted by the compiler, which is
a point in its favor.

-Peff