Re: Mostly portable strnlen_user()

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Fri, 25 May 2012 19:13:31 -0700

On Fri, May 25, 2012 at 6:09 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I can do that. Give me a few minutes. Or more. Let's see how it ends up..

Ugh. So this was not at all as trivial as I thought it would be.

One issue is just syntactic: the fact that you want to initialize
those constants just results in nasty syntax. Is gcc really so bad on
sparc that it doesn't do the obvious CSE etc on the constants? That
surprises me, because it does it on x86-64..

But the more annoying issue is that the dentry code really does depend
on the fact that we can just combine the masks for two values (the
zero bytes and the '/' bytes) and test and munge them together. And
that's simply not true of your big-endian version.

I think that instead of the "find_zero()" function, we need a
"generate_mask()" function that actually generates the reliable mask
of "these bytes have a bit set if the byte was zero". That's the

  unsigned long mask = (c & p->low_bits) + p->low_bits;
  mask = (mask | p->rhs);

phase of the big-endian logic (and would be a no-op on little endian).

So it looks like we'd really want to have three phases:

 - the "quick check if it has a zero" (and save that value away as the
preliminary mask)

   This is what the loop generates and tests end on, and could
(obviously) be or'ed together in the *test* to check two cases at the
same time.

      preliminary = has_zero(val);

 - the "reliable mask" that is generated from the quick mask and the last value

   This would be a "prep" function right after the loop exit, and it
could be or'ed together to have just one value for the final phase:

        reliable = prep_zero_mask(val, preliminary);

 - the find zero byte (and for some users, find "mask" that goes with
it) from the single value generated above.

        offset = find_zero(reliable);

But quite frankly, your three-value struct makes this much uglier than
I think it would need to be. That's *especially* true since the
constants should be shared, even if you have two different values
going on concurrently for the dentry case (the NUL and '/' values).

Because if I read the sparc code correctly, it really never needs more
than one single word at any time. The only reason you have that opaque
data structure is that the other two words are those constants that
you preload, and that really the compiler *should* be able to CSE just
fine. Do you *really* need to CSE them by hand?

I think we could get rid of that structure entirely, and just have
everybody use a single "unsigned long". If you just used the constants
directly in the code and could rely on the compiler. Hmm?

So how bad does the code look on sparc if you just get rid of the
"low_bits" and "high_bits" and just replace them with the constants
they are?

                        Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-arch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html