Re: Float/Double compare

Ian Rogers <ian.rogers@xxxxxxxxxxxxxxxx> · Sun, 01 Jul 2007 09:11:27 -0400

Hi Mark,

Mark Wielaard wrote:
Hi Ian,

On Sat, 2007-06-30 at 15:25 -0400, Ian Rogers wrote:

public static int compare(float x, float y)
{
   int ix = floatAsIntBits(x);
   int iy = floatAsIntBits(y);
   if (ix == iy) return 0;
   if (isNaN(x)) return 1;
   if (isNaN(y)) return -1;
   int ix_sign = ix>>31;
   int iy_sign = iy>>31;
   if (ix_sign == iy_sign) {
     return x > y ? 1 : -1;
   } else {
     return ix_sign - iy_sign;
   }
}

I am not really comfortable with writing this as is since unless the
compiler is smart you will do a multiple isNaN() and floatToIntBits()
calls even if not necessary. Could we try to do something smarter and
factor some of this out like:

// First check for NaNs
if (x != x || y != y)
  {
    int ix = floatToIntBits(x);
    int iy = floatToIntBits(y);
    if (ix == iy)
      return 0;
    if (x != x)
      return 1;
    if (y != y)
      return -1;
  }
else
  {
    // Normal cases, or positive/negative zeros
    if (x > y)
      return 1;
    if (y > x)
      return -1;

    // Either equal or positive/negative zeros
    if (x == 0 && y == 0)
      {
        int ix = floatToIntBits(x);
        int iy = floatToIntBits(y);
        int ix_sign = ix >> 31;
        int iy_sign = iy >> 31;
        return ix_sign - iy_sign;
      }

    return 0;
  }

But maybe I am not giving the compiler/jit enough credit for actually
producing such an optimized version in the first place. And this isn't
actually the same as your code since I moved the NaN checks first again.
The above assumes the NaN test (x != x) is relatively cheap since NaN is
a canonical value is java. That might be a wrong assumption. You might
be able to do something more clever with the actual ordering of floats
according to > (since if any argument is NaN or both are zero it
evaluates to false) or with the ordering of floatToIntBits, but my head
was spinning trying to keep all the corner cases right (this calls for
expanding the Mauve Float.compareTo() test).

Thanks for the feedback. I think your code may well be better. I think 
its good to have identified that there is at least an ~8% potential 
speed up on IA32 with my code, and potentially better with yours. There 
is of course a big dependence on what values are being compared and on 
the architecture. For the arrays of floats I'm seeing in the Jikes RVM a 
lot of the values are identical (either 0.0 or 1.0). With your code that 
has a path of 6 compares to get through. With mine its just one compare, 
but because a floating point compare of 0.0 and -0.0 will yield that 
they are identical I'm forced into doing it on their int bit 
representation to keep the 0.0 case sane.

Here's my hand wavy costing of the now 3 ways we have a doing things:

Case: x == y neither are NaN or 0.0
Newer Cost: 5 float compares
New Cost: 2 float as ints, 1 int compare
Old Cost: 5 float compares

Case: x < y and different sign
Newer Cost: 4 float compares
New Cost: 2 float as ints, 2 int compares, 2 float compares, 2 shifts, 
one subtract
Old Cost: 6 float compares

Case: x < y and same sign
Newer Cost: 4 float compares
New Cost: 2 float as ints, 2 int compares, 3 float compares, 2 shifts
Old Cost: 6 float compares

Case: x is NaN
Newer Cost: 2 float compares (I think you could reuse the original code 
which is slightly cheaper than yours)
New Cost: 2 float as ints, 1 int compare, 1 float compare
Old Cost: 2 float compares

Case: y is NaN
Newer Cost: 2 float compares (again assuming reuse of the original code)
New Cost: 2 float as ints, 1 int compare, 2 float compares
Old Cost: 2 float compares

Case: x == 0.0 and y == 0.0
Newer Cost: 2 float as ints, 6 float compares, 2 shifts, one subtract
New Cost: either 2 float as ints, 1 int compare or 2 float as ints, 2 
int compares, 2 float compares, 2 shifts, one subtract
Old Cost: 4 float compares, 2 float divides, 1 float subtract, 1 float 
to int

So I think the cases where the different ways win out are: most values 
are </> then the newer way has the shortest path, most values are == 
then the new way has the shortest path, for two zeros (common in Jikes 
RVM :-( ) the new way wins out, and no one really cares about NaNs :-)

I agree about the desirable properties of your code, and I think they 
also address Pinski's concern, that floatToIntBits on the common path 
seems bad. I also agree that its good to have a short < and > path. 
Possibly the only redeeming feature of the new code was that it made == 
cheap (depending on the floatToIntBits cost). So my variant of your code 
would be:

// First check for NaNs
if (x != x) {
 if (y != y) return 0;
 else return 1;
}
else if (y != y) {
 return -1;
}
else
 {
   // Normal cases, or positive/negative zeros
   if (x > y)
     return 1;
   if (y > x)
     return -1;

   // Either equal or positive/negative zeros
   if (x == 0 && y == 0)
     {
       int ix = floatToIntBits(x);
       int iy = floatToIntBits(y);
       int ix_sign = ix >> 31;
       int iy_sign = iy >> 31;
       return ix_sign - iy_sign;
     }

   return 0;
 }

I'll wait for some more feedback on the list before sending this in as a 
patch. I guess in any case we should separate this out in 
VMFloat/VMDouble as I can imagine for a CPU lacking an FPU you'd want 
another choice.

Regards,
Ian