Re: ARM NEON optimisations for gf-complete/jerasure/ceph-erasure

Hi,

On 2014-09-04 17:27:16 -0700, Ethan L. Miller wrote:
> Yes, it's possible to use CPU flags to allow the use of advanced
> instruction sets automatically.

Runtime detection of supported instruction sets is also possible with 
the current function pointer approach.
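
A rough sketch of what that could look like, not taken from gf-complete: 
every name here (region_xor_init, region_xor_apply, cpu_has_simd, ...) 
is hypothetical, the "wide" routine is only a stand-in for a real NEON 
implementation, and the hwcap check assumes Linux on ARM and falls back 
to the scalar routine everywhere else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__linux__) && (defined(__arm__) || defined(__aarch64__))
#include <sys/auxv.h>   /* getauxval() */
#include <asm/hwcap.h>  /* HWCAP_NEON / HWCAP_ASIMD */
#endif

typedef void (*region_xor_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Baseline implementation: one byte at a time. */
static void region_xor_scalar(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

/* Stand-in for a NEON version: 8 bytes per step, scalar tail. */
static void region_xor_wide(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t d, s;
        memcpy(&d, dst + i, 8);
        memcpy(&s, src + i, 8);
        d ^= s;
        memcpy(dst + i, &d, 8);
    }
    region_xor_scalar(dst + i, src + i, n - i);
}

/* Hypothetical helper: ask the kernel whether SIMD is usable. */
static int cpu_has_simd(void)
{
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_ASIMD) != 0;
#elif defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return 0; /* unknown platform: use the scalar routine */
#endif
}

/* The function pointer is chosen once, at init time. */
static region_xor_fn region_xor = NULL;

void region_xor_init(void)
{
    region_xor = cpu_has_simd() ? region_xor_wide : region_xor_scalar;
}

void region_xor_apply(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(region_xor != NULL); /* region_xor_init() must run first */
    region_xor(dst, src, n);
}
```

Callers only ever see region_xor_apply(), so the choice between the 
variants stays in one place and can still be overridden by hand.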

>  The difficulty is that, if those
> instructions aren't available, it's not clear which of the "basic"
> approaches to use, since performance can vary based on a lot of
> factors.  Even with advanced instructions, there are often multiple
> reasonable approaches to take, as Janne's email makes clear, so it's
> impossible to say "this algorithm is always best".

I agree that the current approach is a better fit for a model where 
implementations differ in their CPU and memory use. Using ifunc would be 
mostly orthogonal to the issue of badly structured code.
 
> We can certainly set up a default approach if we want, though, that
> can be overridden by compile-time flags.

I don't think this would be an improvement.

> Incidentally, I'm starting to work on coding a version of gf-complete
> (and associated erasure coding functions) in C++ using templates,
> which will hopefully allow us to better separate out different
> implementations.  We could still have run-time dispatch for the
> desired routines, but templates should allow for more compact code and
> better isolation of architecture-specific code.  The big drawback is
> that C++ code isn't typically used in the kernel....

One possible simplification for the carry-less multiplication would be 
to rely on inlining and on the compiler optimising for compile-time 
constants.

Implement one function which does a variable number of polynomial 
reductions. The current functions would then just be thin wrappers which 
call the general function with a compile-time constant for the number of 
reductions. Forced inlining and dead code removal will optimise the 
branches away. The same method could be used to avoid the duplication of 
the inner loop for the optional xor with the destination.
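
To make the idea concrete, here is a minimal sketch for GF(2^8). It is 
not the gf-complete code: all names are hypothetical, the primitive 
polynomial 0x11d is assumed, and a portable shift-and-xor loop stands in 
for the hardware carry-less multiply (vmull.p8 on NEON, PCLMULQDQ on 
x86). Forced inlining uses GCC/clang's always_inline attribute and falls 
back to plain inline elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

#define GF8_POLY 0x11du /* assumed: x^8 + x^4 + x^3 + x^2 + 1 */

/* Software stand-in for a hardware carry-less multiply. */
FORCE_INLINE uint32_t clmul(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int i = 0; i < 16; i++)
        if (b & (1u << i))
            r ^= a << i;
    return r;
}

/* General multiply: 'reductions' is a constant in every caller, so
 * after inlining the loop can be fully unrolled. */
FORCE_INLINE uint8_t gf8_mul_general(uint8_t a, uint8_t b, int reductions)
{
    uint32_t p = clmul(a, b);
    for (int i = 0; i < reductions; i++) {
        uint32_t top = p >> 8;      /* bits that overflow the field */
        p ^= clmul(top, GF8_POLY);  /* clears them, folds them back in */
    }
    return (uint8_t)p;
}

/* General region loop: 'do_xor' is also a constant in every caller,
 * so the branch inside the loop can be removed as dead code. */
FORCE_INLINE void gf8_region_general(uint8_t *dst, const uint8_t *src,
                                     size_t n, uint8_t c,
                                     int reductions, int do_xor)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t v = gf8_mul_general(src[i], c, reductions);
        dst[i] = do_xor ? (uint8_t)(dst[i] ^ v) : v;
    }
}

/* Thin wrappers. Two reductions are enough for 0x11d because the low
 * part of the polynomial (0x1d) has no bits set above bit 4. */
uint8_t gf8_mul_2(uint8_t a, uint8_t b)
{
    return gf8_mul_general(a, b, 2);
}

void gf8_region_mul_2(uint8_t *dst, const uint8_t *src, size_t n, uint8_t c)
{
    gf8_region_general(dst, src, n, c, 2, 0);
}

void gf8_region_mul_xor_2(uint8_t *dst, const uint8_t *src, size_t n,
                          uint8_t c)
{
    gf8_region_general(dst, src, n, c, 2, 1);
}
```

A polynomial that needs three reductions would only need another set of 
one-line wrappers passing 3 instead of 2; the inner loop itself exists 
once.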

Janne