On 11/20/2013 10:56 AM, Andrea Mazzoleni wrote:
> Hi,
>
> Yep. At present, to multiply by 2^-1 I'm using in C:
>
> static inline uint64_t d2_64(uint64_t v)
> {
>         uint64_t mask = v & 0x0101010101010101U;
>         mask = (mask << 8) - mask;
>         v = (v >> 1) & 0x7f7f7f7f7f7f7f7fU;
>         v ^= mask & 0x8e8e8e8e8e8e8e8eU;
>         return v;
> }
>
> and for SSE2:
>
> asm volatile("movdqa %xmm2,%xmm4");
> asm volatile("pxor %xmm5,%xmm5");
> asm volatile("psllw $7,%xmm4");
> asm volatile("psrlw $1,%xmm2");
> asm volatile("pcmpgtb %xmm4,%xmm5");
> asm volatile("pand %xmm6,%xmm2"); with xmm6 == 7f7f7f7f7f7f...
> asm volatile("pand %xmm3,%xmm5"); with xmm3 == 8e8e8e8e8e...
> asm volatile("pxor %xmm5,%xmm2");
>
> where xmm2 is the input/output
>

Now, that doesn't sound like something that can get neatly meshed into
the Cauchy matrix scheme, I assume. It is somewhat nice to have a
scheme which is arbitrarily expandable without having to fall back to
dual parity during the restripe operation. It probably also reduces the
amount of code necessary.

	-hpa
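
For anyone wanting to sanity-check the quoted d2_64() routine, here is a
minimal sketch that compares it, byte lane by byte lane, against a scalar
divide-by-2. It assumes the field is GF(2^8) with polynomial 0x11d, which is
what the 0x8e constant (0x11d >> 1) suggests but which the message does not
state. gf_mul2(), gf_div2() and d2_64_mismatches() are hypothetical helper
names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* SWAR routine from the quoted message: divide 8 packed bytes by 2. */
static inline uint64_t d2_64(uint64_t v)
{
	uint64_t mask = v & 0x0101010101010101ULL;	/* low bit of each byte */
	mask = (mask << 8) - mask;			/* 0x01 -> 0xff per byte */
	v = (v >> 1) & 0x7f7f7f7f7f7f7f7fULL;		/* per-byte shift right */
	v ^= mask & 0x8e8e8e8e8e8e8e8eULL;		/* reduce lanes that were odd */
	return v;
}

/* Scalar reference, ASSUMING polynomial 0x11d: multiply one byte by 2. */
static uint8_t gf_mul2(uint8_t a)
{
	return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
}

/* Scalar reference, same assumption: divide one byte by 2. */
static uint8_t gf_div2(uint8_t a)
{
	return (uint8_t)((a >> 1) ^ ((a & 1) ? 0x8e : 0));
}

/* Hypothetical helper: count byte lanes where d2_64() disagrees
 * with the scalar reference. Each lane gets a different byte so
 * that leakage between neighbouring lanes would show up. */
static int d2_64_mismatches(void)
{
	int bad = 0;

	for (unsigned b = 0; b < 256; b++) {
		uint64_t v = 0, r;
		int lane;

		/* the two scalar references must be inverses of each other */
		assert(gf_mul2(gf_div2((uint8_t)b)) == b);

		for (lane = 0; lane < 8; lane++)
			v |= (uint64_t)((b + 37u * lane) & 0xff) << (8 * lane);

		r = d2_64(v);

		for (lane = 0; lane < 8; lane++) {
			uint8_t in = (uint8_t)(v >> (8 * lane));
			uint8_t out = (uint8_t)(r >> (8 * lane));

			if (out != gf_div2(in))
				bad++;
		}
	}
	return bad;
}
```

Calling d2_64_mismatches() from a small main() and checking that it returns 0
is enough to exercise every byte value in every lane. The same scalar
reference could be used to check the SSE2 sequence, although that is not
attempted here.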