Re: application of KNN code to US zipcode searches?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I tried again to use KNN for a real-world query, and I was able to get
it to add an approximately 6x speed-up vs the cube search or
earthdistance methods ( from 300 ms to 50ms ).

I had to make some notable changes for the KNN index to be considered.

- Of course, I had to switch to using basic point/distance calculation.
  As previously noted, this still needs more work to confirm the
  accuracy and get the "distance" reported in miles.

- The query planner didn't like it when the "ORDER BY" referred to a
  column value instead of a static value, even when I believe it should
  know that the column value never changes. See this pseudo-query where
  we look-up the coordinates for 90210 once:

  EXPLAIN ANALYZE
  SELECT pets.pet_id,
      zipcodes.lon_lat <-> center.lon_lat AS radius
      FROM (SELECT lon_lat FROM zipcodes WHERE zipcode = '90210') AS
center, pets
      JOIN shelters USING (shelter_id)
      JOIN zipcodes USING (zipcode)
       ORDER BY postal_codes.lon_lat <-> center.lon_lat limit 1000;

  This didn't use the KNN index until I changed the "center.lon_lat" in
  the ORDER BY to an explicit point value. I'm not sure if that's
  expected, or something I should take up with -hackers.

  This could be worked around by doing a initial query to look-up this
  value, and then feed a static value into this query. That's not ideal,
  but the combination would still be faster.

- I had to drop the part of the WHERE clause which restricted the
  results to shelters within 50 miles from the target zipcode. However,
  I could set the "LIMIT" so high that I could get back "enough" pets,
  and then the application could trim out the results. Or, perhaps
  I could push this query down into a sub-select, and let PostgreSQL
  do a second pass to throw out some of the results.

In any case, with a real-world speed-up of 6x, this looks like it will
be worth it to us to continue to investigate.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux