On 01/01/2014 09:09 PM, Richard W.M. Jones wrote:
> On Wed, Jan 01, 2014 at 12:21:30PM -0800, Sean Omalley wrote:
>> They are a problem. It is a performance issue at the very least on
>> =ALL= platforms. There is a cost even on Intel's platform for
>> alignment errors, they just fix them up in hardware so it isn't as
>> big of a performance hit. It might be 5 cycles instead of 20.
> On Intel Sandy Bridge and up there is no penalty:
> http://www.agner.org/optimize/blog/read.php?i=142&v=t
> On earlier Intel processors it's not significant:
> http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/
On ARM without hardware fixup it's huge. Using the test in the second
link on Kirkwood (which is much more advanced and faster than standard
armv5tel, thanks to Intel's improvements before they sold the division
to Marvell), the results are:
# ./test
processing word of size 4
offset = 0
average time for offset 0 is 207.45
offset = 1
average time for offset 1 is 7511.55
offset = 2
average time for offset 2 is 7511.35
offset = 3
average time for offset 3 is 7511.55
processing word of size 8
offset = 0
average time for offset 0 is 414.8
offset = 1
average time for offset 1 is 12340.2
offset = 2
average time for offset 2 is 12338.5
offset = 3
average time for offset 3 is 12343.8
offset = 4
average time for offset 4 is 414.2
offset = 5
average time for offset 5 is 12337.5
offset = 6
average time for offset 6 is 12339.9
offset = 7
average time for offset 7 is 12337.4
That's roughly a 36x slowdown for 4-byte words and a 30x slowdown for
8-byte words, with the kernel set to fix up alignment faults:
echo 2 > /proc/cpu/alignment
If you use 3 (fixup+warn) it gets an order of magnitude worse because
syslog eats all the CPU logging the warnings.
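For reference, the number written to /proc/cpu/alignment is, as far as I know, a small bitmask: bit 0 warns, bit 1 fixes up, bit 2 sends a signal. Below is a minimal C sketch of decoding it. The helper names are made up for illustration, and the bit layout is my understanding of the ARM kernel's alignment handler, so check it against your kernel's documentation.

```c
#include <stddef.h>

/* Assumed bit layout of the /proc/cpu/alignment mode value. */
enum {
    ALIGN_WARN   = 1 << 0,  /* log a warning for each fault */
    ALIGN_FIXUP  = 1 << 1,  /* emulate the access in the kernel */
    ALIGN_SIGNAL = 1 << 2   /* deliver a signal to the process */
};

/* Hypothetical helper: human-readable name for modes 0..5. */
static const char *alignment_mode_name(int mode)
{
    static const char *const names[] = {
        "ignored", "warn", "fixup", "fixup+warn", "signal", "signal+warn"
    };
    if (mode < 0 || (size_t)mode >= sizeof names / sizeof names[0])
        return "unknown";
    return names[mode];
}

/* Hypothetical helper: does this mode log every fault? */
static int alignment_mode_warns(int mode)
{
    return (mode & ALIGN_WARN) != 0;
}

/* Hypothetical helper: does this mode have the kernel emulate the access? */
static int alignment_mode_fixes_up(int mode)
{
    return (mode & ALIGN_FIXUP) != 0;
}
```

With this reading, mode 2 fixes up silently, while mode 3 fixes up and also logs every fault, which is where the syslog load comes from.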
I suspect the numbers on the Pi would be similarly bad, but I don't have
one so can't test that. I'll get some numbers for an ARMv7 machine later.
> Anyway, you are optimizing far too early.
It is better to optimize too early than it is to pre-emptively code in
blissful ignorance of what goes on underneath.
> If there's a performance
> problem, run 'perf', find out that it's caused by X where X might be
> the big misalignment penalty on ARM or many other things, then fix
> that.
> There's no need to go on a huge crusade to fix every last
> misalignment, because that will involve vast hours of programmer
> effort for no measurable gain.
Maybe so, but that doesn't mean past errors should be treated as a
precedent, with all code henceforth written without any alignment
consideration, especially since misaligned access has the potential
to be dangerous (e.g. on hardware without automatic alignment fixup,
running kernels that don't fix up alignment faults by default).
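As a rough illustration of what "alignment consideration" can look like in new code, here is a minimal C sketch of reading a 32-bit value from an arbitrary byte address without casting to a wider pointer type. The helper names are hypothetical and not taken from any code discussed in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load a host-endian 32-bit value from any address.
 * memcpy has no alignment requirement, so this is well defined even when
 * p is misaligned; the compiler chooses a load sequence for the target. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: load a little-endian 32-bit value byte by byte.
 * Only byte accesses are performed, so no alignment fault can occur,
 * and the result does not depend on the host's endianness. */
static uint32_t load_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

The pattern to avoid is something like *(uint32_t *)(buf + 1), which is what ends up trapping on cores without hardware fixup. An optimizing compiler will often reduce the memcpy form to a plain load on targets that allow unaligned access, though that is worth confirming for your own toolchain.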
Gordan