On 29/08/2016 at 15:55, Sage Weil wrote:
To answer your question, the only real risk/problem I see is that we need
to keep it perfectly in sync with the non-optimized variant.
I propose a generic implementation that allows sharing the SIMD code
across ARM, Intel, and others (Altivec):
https://github.com/dachary/ceph/commit/71ae4584d9ed57f70aad718d0ffe206a01e91fef
You can try the following, for instance:
#include <stdint.h>
#include <immintrin.h>  /* declares __v32qi (32 x 8-bit lanes) on x86 */

int main(void)
{
    __v32qi va, vb;
    va = (__v32qi) { 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18,
                     17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 4, 1, 0 };
    vb = (__v32qi) { 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18,
                     17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 };
    /* plain C operator on vector types; the compiler selects the SIMD instructions */
    __v32qi res = va ^ vb;
    (void) res;
    return 0;
}
It will produce optimized NEON, AVX, or AVX2 code, depending on the target.
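As a rough sketch of how the same idea could look without any x86-specific header, the vector type can be declared directly with the GCC/Clang vector_size attribute. The names vec32u8 and xor_region are hypothetical, chosen only for illustration, and are not taken from the commit above. This assumes a compiler that supports the vector extensions (GCC or Clang). The scalar loop at the end handles the bytes left over after the last full 32-byte chunk.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type name: 32 lanes of uint8_t, declared with the
 * GCC/Clang vector_size attribute instead of a type from immintrin.h. */
typedef uint8_t vec32u8 __attribute__((vector_size(32)));

/* Hypothetical helper: XOR len bytes of src into dst. */
static void xor_region(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i = 0;

    /* Vector part: 32 bytes per iteration. memcpy keeps the loads and
     * stores valid for unaligned buffers. */
    for (; i + sizeof(vec32u8) <= len; i += sizeof(vec32u8)) {
        vec32u8 a, b;
        memcpy(&a, dst + i, sizeof a);
        memcpy(&b, src + i, sizeof b);
        a ^= b;                         /* element-wise XOR on all 32 lanes */
        memcpy(dst + i, &a, sizeof a);
    }

    /* Scalar tail: remaining bytes, one at a time. */
    for (; i < len; i++)
        dst[i] ^= src[i];
}
```

Which instructions come out depends on the target flags passed to the compiler (for example -mavx2 on x86); without them the compiler may fall back to narrower vectors or scalar code, but the result should be the same.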