[PATCH 2/2] core: Add ARM NEON optimized sample conversion code

pmeerw@xxxxxxxxxx (Peter Meerwald) · Thu, 25 Oct 2012 14:04:45 +0200 (CEST)



Hello,

> > > I was poking around this a bit. An input of 0x3f4aaa95 after the
> > > multiplication with 32767.0 should result in 0x46caa8ff but turns out to
> > > be 0x46caa900. Still trying to figure out why.
> > 
> > I cannot follow your example, it always results in 0x46caa900 (using NEON 
> > or not)

> (gdb) call *(float*)$3 = *(float*)$1 * *(float*)$2
> $6 = 25940.498
> (gdb) p /x *(int*)$3
> $7 = 0x46caa8ff
> This happens on both x86 and the Pandaboard.

I am curious what the correct result is for that multiplication;
the gdb result differs from what the actual C code produces

looking at gdb's valarith.c function scalar_binop(), it appears gdb 
performs multiplications (BINOP_MUL) as double values -- not entirely 
sure, but I wouldn't trust gdb to produce 'accurate' results
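
to check this outside of gdb, here is a minimal sketch (the helper names 
are made up for illustration, they are not from the patch) that computes 
the product once in single precision and once via double, and returns the 
resulting bit pattern; it assumes IEEE 754 floats and the default 
round-to-nearest mode

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE 754 single-precision float. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Return the raw bit pattern of a float. */
static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Multiply in single precision, as the plain C conversion code would. */
static uint32_t mul_as_float(uint32_t a_bits, float b)
{
    volatile float r = bits_to_float(a_bits) * b;
    return float_to_bits(r);
}

/* Multiply in double precision, then round the result to float
 * (roughly what a debugger evaluating in double might do). */
static uint32_t mul_as_double(uint32_t a_bits, double b)
{
    volatile double wide = (double) bits_to_float(a_bits) * b;
    return float_to_bits((float) wide);
}
```

printing mul_as_float(0x3f4aaa95, 32767.0f) and 
mul_as_double(0x3f4aaa95, 32767.0) with %08x shows both paths side by side; 
by my arithmetic the exact product lies between 0x46caa8ff and 0x46caa900 
and is closer to the latter, so round-to-nearest should give 0x46caa900 
either way, and 0x46caa8ff looks like what truncation would produce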

> Possibly we're talking about different things here -- I'm referring to
> the float -> s16le conversion.

I've never seen a deviation between the NEON and C code for float -> s16le;
my off-by-one claim was for s16le -> float only

> For the reverse case, it might still be worth it to take the division's
> performance penalty rather than lose precision, especially if it's still
> a decent performance win over the current code.

float division is implemented by successive approximation in NEON; I 
doubt that it will give exact results
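
to illustrate what I mean by successive approximation, here is a portable 
C sketch of the Newton-Raphson reciprocal refinement; as far as I know the 
NEON version would use the vrecpeq_f32() intrinsic for the initial estimate 
and vrecpsq_f32() for the (2 - d*x) term, but the code below is plain C 
with a made-up stand-in for the estimate so it can be run anywhere; the 
function names are hypothetical

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a low-precision hardware reciprocal estimate:
 * compute 1/d and keep only the top 8 mantissa bits.
 * (Assumption for illustration; real NEON code would not divide here.) */
static float crude_recip_estimate(float d)
{
    float r = 1.0f / d;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFF8000u;   /* clear the low 15 of the 23 mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* One Newton-Raphson step towards 1/d: x' = x * (2 - d * x). */
static float recip_refine(float d, float x)
{
    return x * (2.0f - d * x);
}

/* Approximate 1/d from the crude estimate with the given number of steps. */
static float recip_approx(float d, int steps)
{
    float x = crude_recip_estimate(d);
    for (int i = 0; i < steps; i++)
        x = recip_refine(d, x);
    return x;
}
```

each step roughly doubles the number of correct bits, so two steps get 
close to full single precision, but every step also rounds, so the result 
is not guaranteed to match 1.0f / d bit for bit; comparing 
recip_approx(d, 2) against 1.0f / d over the divisors of interest shows 
where they differ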

p.

-- 

Peter Meerwald
+43-664-2444418 (mobile)