Re: ARM NEON optimisations for gf-complete/jerasure/ceph-erasure

Yes, it's possible to use CPU flags to allow the use of advanced
instruction sets automatically.  The difficulty is that, if those
instructions aren't available, it's not clear which of the "basic"
approaches to use, since performance can vary based on a lot of
factors.  Even with advanced instructions, there are often multiple
reasonable approaches to take, as Janne's email makes clear, so it's
impossible to say "this algorithm is always best".

We can certainly set up a default approach if we want, though, that
can be overridden by compile-time flags.
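
A rough sketch of that arrangement, with made-up names throughout (none of
this is gf-complete's actual API): the selector checks a CPU flag at run
time and otherwise falls back to a default that a compile-time macro can
override. The macro GF_W8_DEFAULT_REGION_MUL, the function names and the
polynomial 0x11d are assumptions for illustration, and the "fast" routine
is a scalar stand-in for a real SSSE3/NEON kernel.

```cpp
#include <cstddef>
#include <cstdint>

// All names below are hypothetical; this is not gf-complete's API.
using region_mul_fn = void (*)(const std::uint8_t* src, std::uint8_t* dst,
                               std::size_t len, std::uint8_t c);

// Multiply two GF(2^8) elements by shift-and-add, reducing by the
// polynomial 0x11d (assumed here for illustration).
static std::uint8_t gf8_mul(std::uint8_t a, std::uint8_t b) {
    unsigned acc = 0, x = a;
    for (unsigned i = 0; i < 8; ++i) {
        if (b & (1u << i)) acc ^= x;
        x <<= 1;
        if (x & 0x100u) x ^= 0x11du;
    }
    return static_cast<std::uint8_t>(acc);
}

// "Basic" approach: one bitwise multiplication per byte.
static void region_mul_bitwise(const std::uint8_t* src, std::uint8_t* dst,
                               std::size_t len, std::uint8_t c) {
    for (std::size_t i = 0; i < len; ++i) dst[i] = gf8_mul(src[i], c);
}

// Split table: two 16-entry tables indexed by the low and high nibble.
// Scalar stand-in for what a byte-shuffle instruction does 16 lanes at a time.
static void region_mul_split_table(const std::uint8_t* src, std::uint8_t* dst,
                                   std::size_t len, std::uint8_t c) {
    std::uint8_t lo[16], hi[16];
    for (unsigned k = 0; k < 16; ++k) {
        lo[k] = gf8_mul(static_cast<std::uint8_t>(k), c);
        hi[k] = gf8_mul(static_cast<std::uint8_t>(k << 4), c);
    }
    for (std::size_t i = 0; i < len; ++i)
        dst[i] = static_cast<std::uint8_t>(lo[src[i] & 0x0f] ^ hi[src[i] >> 4]);
}

// Default when no advanced instructions are found; override at build time
// with e.g. -DGF_W8_DEFAULT_REGION_MUL=region_mul_split_table.
#ifndef GF_W8_DEFAULT_REGION_MUL
#define GF_W8_DEFAULT_REGION_MUL region_mul_bitwise
#endif

// Run-time CPU check. Only the x86 GCC/Clang builtin is used for real here.
static bool cpu_has_byte_shuffle() {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    return __builtin_cpu_supports("ssse3") != 0;
#elif defined(__aarch64__)
    return true;   // assumed: Advanced SIMD is present on typical AArch64 CPUs
#else
    return false;  // unknown platform: use the compile-time default
#endif
}

// Pick the routine once; callers keep the returned pointer.
region_mul_fn select_region_mul() {
    if (cpu_has_byte_shuffle()) return region_mul_split_table;
    return GF_W8_DEFAULT_REGION_MUL;
}
```

A caller would do something like "region_mul_fn f = select_region_mul();"
once at init time and then call f(src, dst, len, c) for every region.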

Incidentally, I'm starting to work on coding a version of gf-complete
(and associated erasure coding functions) in C++ using templates,
which will hopefully allow us to better separate out different
implementations.  We could still have run-time dispatch for the
desired routines, but templates should allow for more compact code and
better isolation of architecture-specific code.  The big drawback is
that C++ code isn't typically used in the kernel....
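
To make the template idea a bit more concrete, here is a minimal sketch
under my own assumptions. FieldTraits, ScalarBackend, SplitTable8Backend,
region_mul_add and pick_w8 are hypothetical names, and the polynomials
(0x13 for w=4, 0x11d for w=8) are chosen for illustration. Generic code is
written once against a backend interface, each backend lives in its own
type, and run-time dispatch still hands out plain function pointers.

```cpp
#include <cstddef>
#include <cstdint>

namespace gf_sketch {

// Per-word-size field parameters (hypothetical layout, assumed polynomials).
template <unsigned W> struct FieldTraits;
template <> struct FieldTraits<4> {
    using word = std::uint8_t;
    static constexpr unsigned poly = 0x13;   // x^4 + x + 1
};
template <> struct FieldTraits<8> {
    using word = std::uint8_t;
    static constexpr unsigned poly = 0x11d;  // x^8 + x^4 + x^3 + x^2 + 1
};

// Portable backend: works for any W that has FieldTraits.
template <unsigned W>
struct ScalarBackend {
    using word = typename FieldTraits<W>::word;
    static word mul(word a, word b) {
        unsigned acc = 0, x = a;
        for (unsigned i = 0; i < W; ++i) {
            if (b & (1u << i)) acc ^= x;
            x <<= 1;
            if (x & (1u << W)) x ^= FieldTraits<W>::poly;
        }
        return static_cast<word>(acc);
    }
    static void region_mul(const word* src, word* dst, std::size_t n, word c) {
        for (std::size_t i = 0; i < n; ++i) dst[i] = mul(src[i], c);
    }
};

// Stand-in for an architecture-specific backend (w=8 only). A real one would
// sit in its own file and use SSSE3 or NEON intrinsics instead of this loop.
struct SplitTable8Backend {
    using word = std::uint8_t;
    static void region_mul(const word* src, word* dst, std::size_t n, word c) {
        word lo[16], hi[16];
        for (unsigned k = 0; k < 16; ++k) {
            lo[k] = ScalarBackend<8>::mul(static_cast<word>(k), c);
            hi[k] = ScalarBackend<8>::mul(static_cast<word>(k << 4), c);
        }
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = static_cast<word>(lo[src[i] & 0x0f] ^ hi[src[i] >> 4]);
    }
};

// Generic code written once: dst ^= c * src, using whatever backend is given.
template <typename Backend>
void region_mul_add(const typename Backend::word* src,
                    typename Backend::word* dst, std::size_t n,
                    typename Backend::word c) {
    typename Backend::word tmp[64];
    for (std::size_t off = 0; off < n; off += 64) {
        const std::size_t len = (n - off < 64) ? (n - off) : 64;
        Backend::region_mul(src + off, tmp, len, c);
        for (std::size_t i = 0; i < len; ++i) dst[off + i] ^= tmp[i];
    }
}

// Run-time dispatch is still a function pointer to one instantiation.
using region_fn = void (*)(const std::uint8_t*, std::uint8_t*, std::size_t,
                           std::uint8_t);

inline region_fn pick_w8(bool have_fast_backend) {
    if (have_fast_backend) return &region_mul_add<SplitTable8Backend>;
    return &region_mul_add<ScalarBackend<8>>;
}

}  // namespace gf_sketch
```

Adding another architecture would then mean adding one more backend type
and one more branch in pick_w8, without touching region_mul_add.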

ethan

On Thu, Sep 4, 2014 at 8:57 AM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
> Hi Janne,
>
> On 04/09/2014 16:42, Janne Grunau wrote:
>> Hi,
>>
>> I've started writing ARM/AArch64 NEON optimizations for gf-complete.
>> http://git.jannau.net/gf-complete.git/log/?h=neon has proof of concept
>> AArch64 NEON optimisations for w8.
>>
>> Implemented methods are so far the carry-less/polynomial multiplication
>> and the split table. The polynomial multiplication is reasonably fast
>> for region multiplications (~2000MB/s on an Apple A7 at 1.3GHz) since
>> NEON has an 8-bit to 16-bit SIMD polynomial multiplication.
>>
>> The split table method is still faster though, 5700MB/s on the same CPU.
>> I'm actually surprised by that since it is faster (per cycle) than the
>> Core i7-3770 from gf-complete's manual (page 14). That suggests that
>> SSE3 code might not be optimal.
>>
>> I'm currently working on integrating NEON into the build system and then
>> will extend the existing code to work on ARMv7-a too. Those two are
>> straightforward. There are a couple of other issues I would like to
>> discuss before I start to work on them.
>>
>> The #if/#ifdefs in the source are starting to make the source hard to
>> read when more than one optimization is added. Separating arch specific
>> implementations from each other and from the generic implementation
>> works reasonably well for the multimedia related projects I have
>> experience with (libav/FFmpeg, x264). There would be arch specific init
>> functions which set the appropriate function pointers. The neon
>> optimisations would then live in w8_arm.c which would be only compiled
>> for arm. If someone has another idea how to avoid the #ifdefs I'm open
>> for that too.
>
> Would it be possible to make use of ifunc ( https://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html#index-g_t_0040code_007bifunc_007d-attribute-2529 ) to choose the function depending on CPU features ?
>
> http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/i386-and-x86-64-Options.html#i386-and-x86-64-Options
>
> http://www.spinics.net/lists/ceph-devel/msg18452.html
>
> Cheers
>
>> I'm currently using the SSE/NOSSE region option which is bogus. I'm
>> wondering whether I should just rename that SIMD/NOSIMD (not really true
>> since the carry-less operations for w64 and w128 only use the SIMD
>> instruction set but are single data). That would need to have backward
>> compatibility for SSE/NOSSE. The other option would be to add
>> NEON/NONEON flags.
>>
>> I'm sure I'll find other issues to discuss when I start integrating the
>> NEON optimisations into jerasure and ceph.
>>
>> thanks
>>
>> Janne
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>



-- 
( Ethan L. Miller               Email: elm@xxxxxxxxxxx            )
( Professor, Computer Science   Web: http://www.cs.ucsc.edu/~elm/ )
( University of California      Phone: +1 831 459-1222            )
( Santa Cruz, CA 95064 USA      Fax:   +1 831 459-1041            )
( PGP keyprint: 76C7 D699 1FF6 A1A4 B7A1 9629 2EBF 1273 A6ED 6A09 )