On Tue, 2012-07-24 at 10:20 +0200, Peter Meerwald wrote:
> From: Peter Meerwald <p.meerwald at bct-electronic.com>
>
> v3:
> * convert from intrinsics to inline assembly
> v2:
> * load and store data with vld1/vld1q and vst1/vst1q, resp., to work
>   around alignment issues of compiler-generated vldmia instruction
> * remove redundant check for NEON flags
>
> Ubuntu/Linaro gcc 4.6.3
> arm-linux-gnueabi-gcc -O2 -mcpu=cortex-a8 -mfloat-abi=softfp -mfpu=neon
>
> runtime on beagle-xm:
>
> D: [pulseaudio] sconv_neon.c: checking NEON sconv_s16le_from_float
> I: [pulseaudio] sconv_neon.c: NEON: 3754 usec.
> I: [pulseaudio] sconv_neon.c: ref: 58594 usec.
> D: [pulseaudio] sconv_neon.c: checking NEON sconv_s16le_to_float
> I: [pulseaudio] sconv_neon.c: NEON: 1831 usec.
> I: [pulseaudio] sconv_neon.c: ref: 10528 usec.
> I: [pulseaudio] sconv_neon.c: Initialising ARM NEON optimized conversions.
>
> conversion may be off by one for some samples due to rounding issues
>
> Signed-off-by: Peter Meerwald <p.meerwald at bct-electronic.com>
> ---

Just so the outcome is archived on the list, I've pushed out this patch
with some fixes for building the NEON code based on the availability of
compiler support, and only actually using the code at run-time based on
processor features.

Also pushed a rounding fix on top of this code.

Thanks!
Arun
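
P.S. For anyone reading this in the archive: a rough sketch of how an off-by-one like the one mentioned in the patch description can arise in a float to s16 conversion, when one code path truncates toward zero and another rounds to nearest. This is only an illustration, not the code from the patch or from the rounding fix. The function names are hypothetical, and the 32767 scale factor and the round-half-away-from-zero rule are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a sample to the nominal [-1.0, 1.0] range. */
static float clamp_unit(float v)
{
    if (v > 1.0f)
        return 1.0f;
    if (v < -1.0f)
        return -1.0f;
    return v;
}

/* Truncating conversion: a plain C cast discards the fractional part
 * (rounds toward zero). Scale factor 32767 is an assumption. */
static int16_t s16_from_float_trunc(float v)
{
    return (int16_t) (clamp_unit(v) * 32767.0f);
}

/* Rounding conversion: add or subtract 0.5 before the cast, so halves
 * are rounded away from zero (assumed rule for this sketch). */
static int16_t s16_from_float_round(float v)
{
    float scaled = clamp_unit(v) * 32767.0f;

    if (scaled >= 0.0f)
        return (int16_t) (scaled + 0.5f);
    return (int16_t) (scaled - 0.5f);
}
```

With an input of 0.5f the scaled value is 16383.5, so the truncating variant yields 16383 while the rounding variant yields 16384. Full-scale inputs agree in both variants, so a comparison against a reference implementation only differs on samples whose scaled value has a fractional part of at least one half.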