Re: [PATCH 06/12] userfaultfd: selftest: avoid my_bcmp false positives with powerpc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 09, 2015 at 12:50:26PM +1000, Michael Ellerman wrote:
> On Tue, 2015-09-08 at 22:43 +0200, Andrea Arcangeli wrote:
> > Keep a non-zero placeholder after the count, for the my_bcmp
> > comparison of the page against the zeropage. The lockless increment
> > between 255 to 256 against a lockless my_bcmp could otherwise return
> > false positives on ppc32le.
> > 
> > Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > ---
> >  tools/testing/selftests/vm/userfaultfd.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> Without groking what the code is doing, this fix makes the test pass on my
> ppc64le box.

On ppc byte reads are "more" out of order, so you can read byte 1
before byte 0. This doesn't seem to happen on x86 (even though
smp_rmb() is not a noop on x86). Now the code didn't even have an
actual compiler barrier but gcc is unlikely to unroll the loop in a
unordered way so that wasn't a problem, but this patch takes care of
gcc too.

One side does a inc++ on a long long. The other side read byte 1
before byte 0, and check if they're both zero. When the inc long long
var transitions from 255 to 256 if you read it in reverse with little
endian, you'll see 0 0 and so my_bcmp would think the page is full of
zeros.

my_bcmp checks that we didn't map a zeropage in there by mistake (like
it would happen if userfaultfd wasn't registered on the anonymous
holes). So instead of relaying on the counter to be always > 0, I
added a further word after the counter that is never changing and not
zero, so we don't have to use any memory barrier for those out of
order checks.

On a side note (feel free to skip this part as it's userland): this is
also why I couldn't use bcmp because bcmp and memcmp return false
positive zeroes if the memory changes under it. In the sse4 unrolled
loop if it finds a difference, it can't tell if it should return > or
< 0 because it's a ptest and not a cmp insn, so then it has to repeat
the memory comparison (re-reading from memory a potentially different
copy of the memory that could have become equal). Problem is it's not
restarting the unrolled loop from where it stopped it, if this final
comparison returns zero and it's not the last byte to compare. In
short glibc memcmp/bcmp can very well return 0 before comparing all
memory that it is told to compare, if the memory is changing under
memcmp/bcmp.

It'd be enough to restart the unrolled loop if the "length" isn't zero
and the last final comparison returned zero, to fix memcmp/bcmp in
glibc. It's not getting fixed because it's not a bug by the C
standard, but it makes the SIMD accellerated bcmp/memcmp in glibc
unusable for lockless fast-path comparisons as it would lead to false
positives that would degrade performance. For example if glibc memcmp
was used to build the stable tree in KSM, it would lead to superfluous
write protections. KSM is in kernel of course so it's not affected by
the memcmp glibc implementation, but similar things can happen in
userland (like I found out the hard way in the userfaultfd program).

Thanks,
Andrea

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]