Optimize i2f()

Looking through the kernel radeon drm source, it looks like the i2f() functions in r600_blit.c and r600_blit_kms.c can be optimized a bit.

The following extends the range to all unsigned 32-bit integers, and avoids the slow loop by using the bsr instruction via __fls().  It provides an exact one-to-one correspondence up to 2^24.  Above that, rounding is inevitable; this routine rounds towards zero (truncation).

/* 23 bits of float fractional data */
#define I2F_FRAC_BITS 23
#define I2F_MASK ((1 << I2F_FRAC_BITS) - 1)

/*
 * Converts an unsigned integer into 32-bit IEEE floating point representation.
 * Will be exact from 0 to 2^24.  Above that, we round towards zero
 * as the fractional bits will not fit in a float.  (It would be better to
 * round towards even as the fpu does, but that is slower.)
 * This routine depends on the mod(32) behaviour of the rotate instructions
 * on x86.
 */
uint32_t i2f(uint32_t x)
{
	uint32_t msb, exponent, fraction;

	/* Zero is special */
	if (!x)
		return 0;

	/* Get location of the most significant bit */
	msb = __fls(x);

	/*
	 * Use a rotate instead of a shift because that works both leftwards
	 * and rightwards due to the mod(32) behaviour.  This means we don't
	 * need to check to see if we are above 2^24 or not.
	 */
	fraction = ror32(x, msb - I2F_FRAC_BITS) & I2F_MASK;
	exponent = (127 + msb) << I2F_FRAC_BITS;

	return fraction + exponent;
}
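
For anyone who wants to check the claims outside the kernel, here is a rough user-space sketch, not part of the patch. It assumes an IEEE-754 float and the default round-to-nearest-even mode for the compiler's own (float) conversion. All names in it (msb_index, i2f_trunc, i2f_nearest, float_bits, i2f_selfcheck) are made up for illustration. msb_index() is a plain-loop stand-in for __fls(), and explicit shifts replace ror32() so that nothing depends on the x86 rotate behaviour. i2f_nearest() is one possible way to get the round-to-even behaviour mentioned in the comment above; it has not been measured for speed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAC_BITS 23
#define FRAC_MASK ((UINT32_C(1) << FRAC_BITS) - 1)

/* Index of the highest set bit; x must be non-zero (stand-in for __fls()). */
static uint32_t msb_index(uint32_t x)
{
	uint32_t n = 0;

	while (x >>= 1)
		n++;
	return n;
}

/* Same result as the rotate version, using explicit shifts (rounds toward zero). */
static uint32_t i2f_trunc(uint32_t x)
{
	uint32_t msb, fraction;

	if (!x)
		return 0;
	msb = msb_index(x);
	if (msb > FRAC_BITS)
		fraction = x >> (msb - FRAC_BITS);	/* low bits are dropped */
	else
		fraction = x << (FRAC_BITS - msb);	/* exact, nothing lost */
	return ((127 + msb) << FRAC_BITS) + (fraction & FRAC_MASK);
}

/* Hypothetical variant that rounds to nearest, ties to even. */
static uint32_t i2f_nearest(uint32_t x)
{
	uint32_t msb, shift, mant, rem, half;

	if (!x)
		return 0;
	msb = msb_index(x);
	if (msb <= FRAC_BITS)
		return i2f_trunc(x);		/* fits exactly, no rounding */
	shift = msb - FRAC_BITS;		/* 1..8 bits will not fit */
	mant = x >> shift;			/* 24 bits incl. the leading 1 */
	rem = x & ((UINT32_C(1) << shift) - 1);
	half = UINT32_C(1) << (shift - 1);
	if (rem > half || (rem == half && (mant & 1)))
		mant++;				/* may carry up to 1 << 24 */
	/*
	 * mant still holds the leading 1, so start from exponent 126 + msb;
	 * the leading 1 (or a rounding carry) bumps the exponent field.
	 */
	return ((126 + msb) << FRAC_BITS) + mant;
}

/* Bit pattern of a float, for comparing against the compiler's conversion. */
static uint32_t float_bits(float f)
{
	uint32_t u;

	memcpy(&u, &f, sizeof(u));
	return u;
}

/* Spot-check both routines against (float)x on a sample of inputs. */
static void i2f_selfcheck(void)
{
	uint32_t x;

	/* Up to 2^24 every integer is representable, so everything agrees. */
	for (x = 0; x <= (UINT32_C(1) << 24); x += 4099) {
		assert(i2f_trunc(x) == float_bits((float)x));
		assert(i2f_nearest(x) == float_bits((float)x));
	}
	/* Above 2^24 truncation may land one step below the nearest value. */
	for (x = UINT32_MAX; x > (UINT32_C(1) << 24); x -= 65521) {
		assert(i2f_nearest(x) == float_bits((float)x));
		assert(i2f_trunc(x) <= i2f_nearest(x));
	}
}
```

Calling i2f_selfcheck() from a small main() runs the comparison. As a worked case, 0x1000003 (2^24 + 3) truncates to 0x4b800001 (16777218.0), whereas round-to-even gives 0x4b800002 (16777220.0).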

Steven Fuerst