Re: Arch maintainers Ahoy!

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Wed, 23 May 2012 14:01:26 -0700

On Wed, May 23, 2012 at 1:36 PM, David Miller <davem@xxxxxxxxxxxxx> wrote:
>
> I toyed around with some of the ideas we discussed but gcc really
> mishandled all the approaches I tried.

Have you tried coding them as ?: expressions, along with making all
the temporaries separate variables? Sometimes that seems to make gcc
more eager to use cmov's.

Although that seemed to work better before. These days gcc sometimes
seems so eager to show it knows better than the programmer that it is
hard to make it do the obvious thing from the obvious source code.
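
To make the "?: plus separate temporaries" suggestion concrete, here is
a minimal sketch of the zero-byte location search written that way for
a 32-bit big-endian word. The function name and the exact split into
temporaries are assumptions made for illustration, not code from this
thread, and whether gcc actually emits conditional moves for it depends
on the version and target.

```c
#include <stdint.h>

/*
 * Hypothetical helper: 'mask' holds 0x80 in every byte position that
 * was zero in the original word.  Returns the index (0..3, counted
 * from the most significant byte, i.e. big-endian memory order) of the
 * first zero byte.  The caller must pass a nonzero mask.
 */
static inline unsigned first_zero_index_be(uint32_t mask)
{
    /* Each intermediate value gets its own variable, and each
     * selection is a plain ?: expression rather than an if/else. */
    uint32_t hi_half = mask >> 16;
    uint32_t half    = hi_half ? hi_half : mask;
    unsigned base    = hi_half ? 0u : 2u;
    unsigned off     = (half & 0x8000u) ? 0u : 1u;

    return base + off;
}
```

For example, a mask of 0x00808080 (zero in the second byte and after)
yields index 1.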

> 1) In the loop, use the test:
>
>      (x + 0xfefefeff) & ~(x | 0x7f7f7f7f)
>
>   It's the same effective cost as the current test (on sparc
>   it would be ADD, OR, ANDNCC).
>
>   We make sure to calculate the "x | 0x7f7f7f7f" part into
>   a variable which is not clobbered by the rest of the test.
>
>   This is so we can reuse it in #2.
>
> 2) Once we find a word containing the zero byte, do a:
>
>        ~(((x & 0x7f7f7f7f) + 0x7f7f7f7f) | x | 0x7f7f7f7f)
>
>   and that "x | 0x7f7f7f7f" part is already calculated and thus
>   can be cribbed from the place we left it in #1 above.
>
>   And now we'll have exactly a 0x80 where there is a zero byte,
>   and no bleeding of 0x80 values into adjacent byte positions.
>
> Once we have that we can just test that mask directly for the
> zero byte location search code.
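
The two quoted steps can be sketched in C as follows, for 32-bit words.
The function and macro names are assumptions chosen for illustration;
only the two expressions themselves come from the quoted text.

```c
#include <stdint.h>

#define NEG_ONES 0xfefefeffu  /* -0x01010101 modulo 2^32 */
#define LOW7     0x7f7f7f7fu

/*
 * Step 1 (loop test): nonzero if some byte of x is zero.  Because the
 * subtraction borrows, 0x80 may also show up in bytes above the zero
 * byte.  "x | LOW7" is saved through x_or_low7 so step 2 can reuse it.
 */
static inline uint32_t has_zero(uint32_t x, uint32_t *x_or_low7)
{
    *x_or_low7 = x | LOW7;
    return (x + NEG_ONES) & ~*x_or_low7;
}

/*
 * Step 2 (after the loop): exactly 0x80 in each zero byte and nothing
 * elsewhere.  The per-byte add cannot carry into the next byte, since
 * 0x7f + 0x7f = 0xfe.
 */
static inline uint32_t zero_byte_mask(uint32_t x, uint32_t x_or_low7)
{
    return ~(((x & LOW7) + LOW7) | x_or_low7);
}
```

With x = 0x01004242, step 1 gives 0x80800000 (the borrow bleeds into
the 0x01 byte), while step 2 gives the exact mask 0x00800000.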

Sounds likely, and you only have two different constants to worry about.

Sadly, I don't see any way to get the "only high bits set" cheaply,
like the little-endian case does (i.e. going from "zero in second byte
and after": 0x00808080 to the byte mask you need: 0xff000000). If you
had that, and the appropriate unaligned loads, you'd also have
everything for the dcache case, not just strncpy.
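
As a rough illustration of the asymmetry, the sketch below puts one
well-known little-endian trick next to a straightforward big-endian
fallback, both for 32-bit words. The function names are hypothetical,
and the big-endian loop is only there to show the extra work; it is not
a proposed solution.

```c
#include <stdint.h>

/*
 * Little-endian: 'bits' has 0x80 in the first zero byte (higher bytes
 * may have 0x80 set too, which does not matter).  Two operations plus
 * a shift give 0xff in every byte before the first zero.  'bits' must
 * be nonzero.
 */
static inline uint32_t byte_mask_le(uint32_t bits)
{
    bits = (bits - 1) & ~bits;  /* all bits below the lowest set bit */
    return bits >> 7;
}

/*
 * Big-endian: 'bits' must be the exact mask (0x80 only in zero bytes).
 * Walks from the most significant byte down, so it costs up to four
 * iterations instead of a couple of ALU operations.
 */
static inline uint32_t byte_mask_be(uint32_t bits)
{
    uint32_t keep = 0;
    uint32_t byte = 0xff000000u;

    while (byte && !(bits & byte)) {
        keep |= byte;
        byte >>= 8;
    }
    return keep;
}
```

Here byte_mask_be(0x00808080) returns 0xff000000, matching the example
above.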

                  Linus