Re: SIMD accelerated crush_do_rule proof of concept

On 29/08/2016 at 15:55, Sage Weil wrote:
> To answer your question, the only real risk/problem I see is that we need
> to keep it perfectly in sync with the non-optimized variant

I propose a generic implementation that lets the SIMD code be shared across ARM, Intel and other architectures (AltiVec).


https://github.com/dachary/ceph/commit/71ae4584d9ed57f70aad718d0ffe206a01e91fef

For instance, you can try the following:
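#include <stdint.h>
#include <immintrin.h>	/* x86 only: provides the __v32qi vector type */

void xor_example(void)
{
	__v32qi va, vb;
	va = (__v32qi) { 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 4, 1, 0 };
	vb = (__v32qi) { 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 };

	/* element-wise XOR of the 32 bytes, written with the plain ^ operator */
	__v32qi res = va ^ vb;
	(void) res;	/* silence the unused-variable warning */
}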

The compiler will produce optimized NEON, AVX or AVX2 code, depending on the target.
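
Since <immintrin.h> only exists on x86, here is a minimal sketch of the same idea that does not depend on it, assuming GCC or Clang and their vector_size attribute. The type name v32i8 and the xor32_* helpers are hypothetical names made up for this illustration; they are not taken from the commit above. The scalar function stands in for the non-optimized variant, and xor32_in_sync() shows one possible way to check that both variants agree on a given input.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type name: 32 signed bytes declared with the GCC/Clang
 * vector_size attribute, so no target-specific header is needed. */
typedef int8_t v32i8 __attribute__((vector_size(32)));

/* Vector variant: a single element-wise XOR over all 32 lanes. */
static void xor32_vector(const int8_t *a, const int8_t *b, int8_t *out)
{
	v32i8 va, vb, res;

	memcpy(&va, a, sizeof(va));	/* memcpy avoids alignment assumptions */
	memcpy(&vb, b, sizeof(vb));
	res = va ^ vb;
	memcpy(out, &res, sizeof(res));
}

/* Scalar reference variant that the vector code has to match. */
static void xor32_scalar(const int8_t *a, const int8_t *b, int8_t *out)
{
	for (int i = 0; i < 32; i++)
		out[i] = (int8_t)(a[i] ^ b[i]);
}

/* Returns 1 when both variants agree on these inputs, 0 otherwise. */
static int xor32_in_sync(const int8_t *a, const int8_t *b)
{
	int8_t v[32], s[32];

	xor32_vector(a, b, v);
	xor32_scalar(a, b, s);
	return memcmp(v, s, sizeof(v)) == 0;
}
```

Running a check like xor32_in_sync() over many inputs on each target is one way to catch the two code paths drifting apart; it is only a sketch and does not replace testing against the real crush_do_rule output.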

