Re: x86: faster strncpy_from_user()

Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx> · Wed, 11 Apr 2012 09:56:51 +1000

On Tue, 2012-04-10 at 16:33 -0700, H. Peter Anvin wrote:
> > Just wanted to mention that handling the detect zeroes operations on
> > cpus that require alignment is easy, just rewind the pointer at the
> > beginning to be aligned and "or" in a mask of 0xff for each alignment
> > pad byte into the initially loaded word.
> > 
> 
> Even on machines which don't require alignment it will still be faster
> to do aligned memory references only, not counting the startup cost
> (which is substantial in this case, of course, since the average length
> is so short.)  However, it also neatly avoids the page overrun problem.

I'm leaning toward that too, but I want to run some benchmarks first.
The main issues for me are:

 - I have to deal with a reasonably wide range of cores which handle
unaligned accesses very differently. Almost all do it in HW, but with
widely varying performance, and some will occasionally trap (SW
emulation kicks in, but that's extremely slow). The trapping case is
generally rare though; depending on the core, it happens on things like
page boundaries or segment boundaries. I also suspect that the
byte-reverse load/store instructions will suck more on unaligned
accesses.

 - The page overrun is an issue. On 64-bit we don't have anything mapped
past the end of the linear mapping, and on 32-bit we fall into ioremap
space. That's fixable with a quick hack to add one more page to the
linear mapping, creating a double mapping of either page 0 or any random
page of memory. I don't have cache aliases or anything like that to
worry about, but it's gross.
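
For reference, the aligned approach quoted at the top (rewind the
pointer, then "or" 0xff into each alignment pad byte of the first word)
could look roughly like the sketch below. This is only an illustration
in plain userspace C, not code from this thread or from the kernel:
aligned_strlen() and has_zero_byte() are hypothetical names, and the
mask is built through a byte array so the sketch does not assume a
particular endianness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  ((unsigned long)-1 / 0xff)   /* 0x0101...01 */
#define HIGHS (ONES * 0x80)                /* 0x8080...80 */

/* Nonzero iff at least one byte of w is zero. */
static int has_zero_byte(unsigned long w)
{
    return ((w - ONES) & ~w & HIGHS) != 0;
}

/*
 * Hypothetical helper: strlen() using only word-aligned loads.
 * It reads the whole aligned word containing s (including bytes
 * before s) and the whole aligned word containing the NUL.
 */
static size_t aligned_strlen(const char *s)
{
    size_t pad = (uintptr_t)s % sizeof(unsigned long);
    const char *p = s - pad;               /* rewind to word boundary */
    unsigned char bytes[sizeof(unsigned long)];
    unsigned long w, mask;
    size_t i;

    /* Mask with 0xff in each of the pad bytes, in memory order. */
    memset(bytes, 0, sizeof bytes);
    memset(bytes, 0xff, pad);
    memcpy(&mask, bytes, sizeof mask);

    /* First word: pad bytes are forced non-zero so they never match. */
    memcpy(&w, p, sizeof w);
    w |= mask;

    /* Remaining words are aligned and need no mask. */
    while (!has_zero_byte(w)) {
        p += sizeof w;
        memcpy(&w, p, sizeof w);
    }

    /* Locate the first zero byte inside the final word. */
    memcpy(bytes, &w, sizeof w);
    for (i = 0; i < sizeof w; i++)
        if (bytes[i] == 0)
            break;

    return (size_t)((p + i) - s);
}
```

Since every load is word-aligned and the page size is a multiple of the
word size, no single load in this sketch straddles a page boundary,
which is the property that sidesteps the overrun problem.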

Anyways, I'll try to play around with it if I get time. It might have to
wait until next week tho, as I have some more urgent stuff to sort out
and I'm off Friday to Tuesday.

Cheers,
Ben.
