Re: Float/Double compare

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Mark,

Mark Wielaard wrote:
Hi Ian,

On Sat, 2007-06-30 at 15:25 -0400, Ian Rogers wrote:
public static int compare(float x, float y)
{
   int ix = floatAsIntBits(x);
   int iy = floatAsIntBits(y);
   if (ix == iy) return 0;
   if (isNaN(x)) return 1;
   if (isNaN(y)) return -1;
   int ix_sign = ix>>31;
   int iy_sign = iy>>31;
   if (ix_sign == iy_sign) {
     return x > y ? 1 : -1;
   } else {
     return ix_sign - iy_sign;
   }
}

I am not really comfortable with writing this as is since unless the
compiler is smart you will do a multiple isNaN() and floatToIntBits()
calls even if not necessary. Could we try to do something smarter and
factor some of this out like:

// First check for NaNs
if (x != x || y != y)
  {
    int ix = floatToIntBits(x);
    int iy = floatToIntBits(y);
    if (ix == iy)
      return 0;
    if (x != x)
      return 1;
    if (y != y)
      return -1;
  }
else
  {
    // Normal cases, or positive/negative zeros
    if (x > y)
      return 1;
    if (y > x)
      return -1;

    // Either equal or positive/negative zeros
    if (x == 0 && y == 0)
      {
        int ix = floatToIntBits(x);
        int iy = floatToIntBits(y);
        int ix_sign = ix >> 31;
        int iy_sign = iy >> 31;
        return ix_sign - iy_sign;
      }

    return 0;
  }

But maybe I am not giving the compiler/jit enough credit for actually
producing such an optimized version in the first place. And this isn't
actually the same as your code since I moved the NaN checks first again.
The above assumes the NaN test (x != x) is relatively cheap since NaN is
a canonical value is java. That might be a wrong assumption. You might
be able to do something more clever with the actual ordering of floats
according to > (since if any argument is NaN or both are zero it
evaluates to false) or with the ordering of floatToIntBits, but my head
was spinning trying to keep all the corner cases right (this calls for
expanding the Mauve Float.compareTo() test).
Thanks for the feedback. I think your code may well be better. I think its good to have identified that there is at least an ~8% potential speed up on IA32 with my code, and potentially better with yours. There is of course a big dependence on what values are being compared and on the architecture. For the arrays of floats I'm seeing in the Jikes RVM a lot of the values are identical (either 0.0 or 1.0). With your code that has a path of 6 compares to get through. With mine its just one compare, but because a floating point compare of 0.0 and -0.0 will yield that they are identical I'm forced into doing it on their int bit representation to keep the 0.0 case sane.

Here's my hand wavy costing of the now 3 ways we have a doing things:

Case: x == y neither are NaN or 0.0
Newer Cost: 5 float compares
New Cost: 2 float as ints, 1 int compare
Old Cost: 5 float compares

Case: x < y and different sign
Newer Cost: 4 float compares
New Cost: 2 float as ints, 2 int compares, 2 float compares, 2 shifts, one subtract
Old Cost: 6 float compares

Case: x < y and same sign
Newer Cost: 4 float compares
New Cost: 2 float as ints, 2 int compares, 3 float compares, 2 shifts
Old Cost: 6 float compares

Case: x is NaN
Newer Cost: 2 float compares (I think you could reuse the original code which is slightly cheaper than yours)
New Cost: 2 float as ints, 1 int compare, 1 float compare
Old Cost: 2 float compares

Case: y is NaN
Newer Cost: 2 float compares (again assuming reuse of the original code)
New Cost: 2 float as ints, 1 int compare, 2 float compares
Old Cost: 2 float compares

Case: x == 0.0 and y == 0.0
Newer Cost: 2 float as ints, 6 float compares, 2 shifts, one subtract
New Cost: either 2 float as ints, 1 int compare or 2 float as ints, 2 int compares, 2 float compares, 2 shifts, one subtract Old Cost: 4 float compares, 2 float divides, 1 float subtract, 1 float to int


So I think the cases where the different ways win out are: most values are </> then the newer way has the shortest path, most values are == then the new way has the shortest path, for two zeros (common in Jikes RVM :-( ) the new way wins out, and no one really cares about NaNs :-)

I agree about the desirable properties of your code, and I think they also address Pinski's concern, that floatToIntBits on the common path seems bad. I also agree that its good to have a short < and > path. Possibly the only redeeming feature of the new code was that it made == cheap (depending on the floatToIntBits cost). So my variant of your code would be:

// First check for NaNs
if (x != x) {
 if (y != y) return 0;
 else return 1;
}
else if (y != y) {
 return -1;
}
else
 {
   // Normal cases, or positive/negative zeros
   if (x > y)
     return 1;
   if (y > x)
     return -1;

   // Either equal or positive/negative zeros
   if (x == 0 && y == 0)
     {
       int ix = floatToIntBits(x);
       int iy = floatToIntBits(y);
       int ix_sign = ix >> 31;
       int iy_sign = iy >> 31;
       return ix_sign - iy_sign;
     }

   return 0;
 }

I'll wait for some more feedback on the list before sending this in as a patch. I guess in any case we should separate this out in VMFloat/VMDouble as I can imagine for a CPU lacking an FPU you'd want another choice.

Regards,
Ian


[Index of Archives]     [Linux Kernel]     [Linux Cryptography]     [Fedora]     [Fedora Directory]     [Red Hat Development]

  Powered by Linux