On Thu, 2012-10-25 at 11:19 +0200, Peter Meerwald wrote:
> Hello Arun,
>
> > I was poking around this a bit. An input of 0x3f4aaa95 after the
> > multiplication with 32767.0 should result in 0x46caa8ff but turns out to
> > be 0x46caa900. Still trying to figure out why.
>
> I cannot follow your example, it always results in 0x46caa900 (using NEON
> or not)

(Because I find it a bit easier to show the reasoning, this is from gdb
and not a C program.)

(gdb) call malloc(4)
$1 = (void *) 0x61c010
(gdb) call malloc(4)
$2 = (void *) 0x61c030
(gdb) call malloc(4)
$3 = (void *) 0x61c050
(gdb) call *(int*)$1 = 0x3f4aaa95
$4 = 1061857941
(gdb) call *(float*)$2 = 32767.0
$5 = 32767
(gdb) call *(float*)$3 = *(float*)$1 * *(float*)$2
$6 = 25940.498
(gdb) p /x *(int*)$3
$7 = 0x46caa8ff

This happens on both x86 and the Pandaboard.

> but I think I have a good explanation:
>
> static void pa_sconv_s16le_to_float32ne(unsigned n, const int16_t *src, float *dst) {
[...]

Possibly we're talking about different things here -- I'm referring to
the float -> s16le conversion.

For the reverse case, it might still be worth taking the division's
performance penalty rather than losing precision, especially if it's
still a decent performance win over the current code.

Regards,
Arun
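
To repeat the multiplication outside gdb, something like the following C
could be used. This is only a sketch: the helper names (float_from_bits,
bits_from_float, scale_bits, print_scaled) are made up for illustration and
are not PulseAudio functions, and it assumes float is IEEE 754 binary32. The
multiply is done on float operands in compiled code, which may not be how gdb
evaluates the same expression, so the two results need not match.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes IEEE 754 binary32). */
static float float_from_bits(uint32_t bits) {
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret a float as its 32-bit pattern. */
static uint32_t bits_from_float(float f) {
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Multiply at run time and return the bits of the float result.
 * The volatile qualifiers keep the compiler from folding the product
 * at compile time and force the result to be stored as a float. */
static uint32_t scale_bits(uint32_t in_bits, float factor) {
    volatile float in = float_from_bits(in_bits);
    volatile float scale = factor;
    volatile float out = in * scale;
    return bits_from_float(out);
}

/* Hypothetical convenience wrapper: print input bits, factor, result bits. */
static void print_scaled(uint32_t in_bits, float factor) {
    printf("0x%08x * %g -> 0x%08x\n",
           (unsigned) in_bits, (double) factor,
           (unsigned) scale_bits(in_bits, factor));
}
```

Calling print_scaled(0x3f4aaa95u, 32767.0f) from main() prints the result
bits, which can then be compared against the 0x46caa8ff and 0x46caa900
values discussed above.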