4x4 single-precision matrix product with SSE

Hello list,

I am writing an assembly function that multiplies two 4x4
single-precision matrices. I wrote two versions, one using SSE and the
other using SSE4.1. What surprised me is that the SSE4.1 version fails
to beat the SSE version; it is in fact slightly slower.

Is this the right place to ask for help? If anyone is interested, I can
post some code, which might clarify the situation a bit.

If this is not the right place, please ignore me...

nick


