On 03/29/2013 05:56 PM, Peter Meerwald wrote:
> +static inline void mono_to_stereo_float_neon_a9(float *dst, const float *src, unsigned n) {
> +    int i = n & 1;
> +
> +    __asm__ __volatile__ (
> +        "movs %[n], %[n], lsr #1 \n\t"
> +        "beq 2f \n\t"
> +
> +        "1: \n\t"
> +        "ldm %[src]!, {r4,r6} \n\t"
> +        "mov r5, r4 \n\t"
> +        "mov r7, r6 \n\t"
> +        "subs %[n], %[n], #1 \n\t"
> +        "stm %[dst]!, {r4-r7} \n\t"
> +        "bgt 1b \n\t"
> +
> +        "2: \n\t"
> +
> +        : [dst] "+r" (dst), [src] "+r" (src), [n] "+r" (n) /* output operands (or input operands that get modified) */
> +        : /* input operands */
> +        : "memory", "cc", "r4", "r5", "r6", "r7" /* clobber list */
> +    );

Could you add a comment explaining why we have separate implementations
for A8 and A9? Also, this isn't NEON code at all, is it? So perhaps the
function name should be mono_to_stereo_float_generic_arm or something
like that.

> +static void init_remap_neon(pa_remap_t *m) {
> +    unsigned n_oc, n_ic;
> +
> +    n_oc = m->o_ss->channels;
> +    n_ic = m->i_ss->channels;
> +
> +    /* find some common channel remappings, fall back to full matrix operation. */
> +    if (n_ic == 1 && n_oc == 2 &&
> +            m->map_table_i[0][0] == PA_VOLUME_NORM && m->map_table_i[1][0] == PA_VOLUME_NORM) {
> +        if (arm_flags & PA_CPU_ARM_CORTEX_A8) {
> +            m->do_remap = (pa_do_remap_func_t) remap_mono_to_stereo_neon_a8;
> +            pa_log_info("Using ARM NEON/A8 mono to stereo remapping");
> +        }
> +        else {
> +            m->do_remap = (pa_do_remap_func_t) remap_mono_to_stereo_neon_a9;
> +            pa_log_info("Using ARM NEON/A9 mono to stereo remapping");

This log message is incorrect, if I'm right in that the code in
remap_mono_to_stereo_neon_a9() isn't NEON code.

The remapping code looks correct, thanks for this work!

--
Tanu
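
For reference, here is a minimal plain-C sketch of what the quoted loop appears to do. Each iteration of the assembly loads two input samples into r4 and r6 with ldm, copies them into r5 and r7, and stores all four registers with stm, so every mono sample ends up in both the left and right output slots. Only core integer registers are involved, which is why it does not look like NEON code. The function name mono_to_stereo_float_c is hypothetical and not part of the patch, and the treatment of an odd trailing sample is an assumption, since the quoted hunk only shows `i = n & 1` being computed.

```c
/* Hypothetical reference implementation, not taken from the patch.
 * dst must have room for 2 * n floats; src holds n mono samples. */
static void mono_to_stereo_float_c(float *dst, const float *src, unsigned n) {
    unsigned i;

    for (i = 0; i < n; i++) {
        dst[2 * i] = src[i];     /* left channel */
        dst[2 * i + 1] = src[i]; /* right channel */
    }
}
```

A loop like this could also serve as the baseline when measuring whether separate A8 and A9 variants are worth keeping.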