On Sun, 2013-01-13 at 20:59 +0200, Tanu Kaskinen wrote:
> On Sun, 2013-01-13 at 14:53 +0100, Peter Meerwald wrote:
> > > > diff --git a/src/pulsecore/sconv_neon.c b/src/pulsecore/sconv_neon.c
> > > > index 6fd966d..111b56f 100644
> > > > --- a/src/pulsecore/sconv_neon.c
> > > > +++ b/src/pulsecore/sconv_neon.c
> > > > @@ -36,16 +36,11 @@ static void pa_sconv_s16le_from_f32ne_neon(unsigned n, const float *src, int16_t
> > > >          "movs %[n], %[n], lsr #2 \n\t"
> > > >          "beq 2f \n\t"
> > > >
> > > > -        "vdup.f32 q2, %[plusone] \n\t"
> > > > -        "vneg.f32 q3, q2 \n\t"
> > > > -        "vdup.f32 q4, %[scale] \n\t"
> > > > -        "vdup.u32 q5, %[mask] \n\t"
> > > > +        "vdup.f32 q1, %[scale] \n\t"
> > > >
> > > >          "1: \n\t"
> > > >          "vld1.32 {q0}, [%[src]]! \n\t"
> > > > -        "vmin.f32 q0, q0, q2 \n\t" /* clamp */
> > > > -        "vmax.f32 q0, q0, q3 \n\t"
> > > > -        "vmul.f32 q0, q0, q4 \n\t" /* scale */
> > > > +        "vmul.f32 q0, q0, q1 \n\t" /* scale */
> > > >          "vcvt.s32.f32 q0, q0, #16 \n\t" /* narrow */
> >
> > > You removed clamping - what happens if there's need for clamping? (I'm
> > > not very good at reading assembly.)
> >
> > vrshrn does the narrowing int32->int16 (with saturation); the comment
> > should be moved one line down
>
> The vcvt instruction converts floating-point numbers to fixed-point
> numbers, with 16 bits in the integer part and 16 bits in the fractional
> part, so most of the interesting stuff happens already in vcvt. How does
> vcvt handle the situation where the float doesn't fit in the 16 bits
> that are reserved for the integer part? Saturation or SIGFPE, or
> something else? How is NaN handled? The reference[1] that I'm using
> doesn't say anything about this...
>
> You say that vrshrn does its thing with saturation. Since the integer
> part of the fixed-point input is already 16 bits, there's not much need
> for saturation. Only the rounding of the fractional part can cause
> overflow, so do you mean that if the rounding would cause overflow,
> vrshrn uses truncation instead of rounding? (This is not specified in
> the reference either.)
>
> [1] http://infocenter.arm.com/help/topic/com.arm.doc.dui0204j/CIHFFGJG.html

You never answered these questions, and the new patch version contains
the same code.

"vcvt.s32.f32 q0, q0, #16" converts four floats into four 16.16
fixed-point numbers. What happens if the input is greater than
INT16_MAX?

-- 
Tanu
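
To make the question concrete, here is a small scalar C model of one lane of the two instructions. It is only a sketch under stated assumptions, not something taken from the patch or confirmed by the reference above: it assumes that vcvt rounds toward zero, saturates to the int32 range and turns NaN into 0, and that vrshrn adds the rounding constant, shifts, and keeps the low 16 bits without saturating. All function names are hypothetical, and the assumptions should be checked against the ARM Architecture Reference Manual before anyone relies on them.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical model of one lane of "vcvt.s32.f32 q0, q0, #16".
 * ASSUMPTION: round toward zero, saturate to the int32 range, NaN -> 0. */
static int32_t model_vcvt_s32_f32_q16(float x)
{
    double scaled;

    if (isnan(x))
        return 0;
    scaled = (double) x * 65536.0;      /* 16 fractional bits, exact in double */
    if (scaled >= 2147483647.0)
        return INT32_MAX;
    if (scaled <= -2147483648.0)
        return INT32_MIN;
    return (int32_t) scaled;            /* C conversion truncates toward zero */
}

/* Hypothetical model of one lane of "vrshrn.s32 d0, q0, #16".
 * ASSUMPTION: add the rounding constant, shift right by 16 and keep only
 * the low 16 bits of the result, i.e. no saturation. */
static int16_t model_vrshrn_s32_16(int32_t v)
{
    uint32_t sum = (uint32_t) v + 0x8000u;  /* wraps modulo 2^32 */
    uint32_t hi = sum >> 16;                /* bits 16..31 of the sum */

    /* Reinterpret the 16-bit pattern as a signed value. */
    return (int16_t) (hi >= 0x8000u ? (int32_t) hi - 0x10000 : (int32_t) hi);
}

/* Both steps chained; the argument is the sample after the vmul scaling. */
int16_t model_convert(float scaled_sample)
{
    return model_vrshrn_s32_16(model_vcvt_s32_f32_q16(scaled_sample));
}

/* A few inputs and what the model (not real hardware) produces for them. */
void model_check_examples(void)
{
    assert(model_convert(0.25f) == 0);
    assert(model_convert(0.5f) == 1);
    assert(model_convert(32767.25f) == 32767);
    assert(model_convert(32767.75f) == -32768);  /* rounding overflows */
    assert(model_convert(40000.0f) == -32768);   /* saturated, then wraps */
    assert(model_convert(NAN) == 0);
}
```

If those assumptions are right, an input above INT16_MAX saturates in vcvt but then wraps to -32768 in the narrowing step, which is exactly the case the removed vmin/vmax clamping would have guarded against. If vrshrn does saturate, as stated earlier in the thread, the last step of the model is wrong and the result would be 32767 instead.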