Hello, I want to report that I have now solved the challenges I ran into using KNN for US zipcode searching. I've found the new approach to not only be viable, but to benchmark about 3x faster for our own real-world application than the previous approach we used, involving cube_distance() and earth_box(). Here's some details about my research so far. To evaluate it, I installed PostgreSQL 9.1 and a current PostGIS 2.0 snapshot (not yet released as stable). A primary challenge I had to solve was that KNN is designed for a slightly different problem than what I needed to solve. I need to answer the question: "What are all the objects that are in zipcodes with 50 miles of a given zipcode?" However, KNN only directly provides a performance boost to this variation: "What are the N nearest objects to this point?" Just adding a "WHERE clause" to check the 50 mile rule would erase the benefits of KNN, which works through an "ORDER BY" clause. I solved my issue by using a "WITH" clause that creates a pseudo-table called "nearby_zipcodes". In this example, I select all the zipcodes that are within 50 miles of the "47374" zipcode. The trick I've employed is that I've set the LIMIT value to 286-- exactly the number of zipcodes within 50 miles of 47374. My plan is to add another column to my "zipcodes" table for each of the small number distances I need to search. Then, when I load new zipcodes I can pre-compute how many zipcodes would be found at this distance. This have approach would not have worked without a "WITH" clause, or some equivalent, because the number of objects within the radius is not known, but the number of nearby zipcodes is fixed. This approach allows me to get the performance benefits of KNN, while also returning exactly those objects within 50 miles of my target zipcode, by JOINing on the "nearby_zipcodes" table: WITH nearby_zipcodes AS ( SELECT zipcode, st_distance_sphere(lonlat_point, (SELECT lonlat_point from zipcodes WHERE zipcode = '47374')) / 1609.344 as radius FROM zipcodes ORDER BY lonlat_point <-> (SELECT lonlat_point from zipcodes WHERE zipcode = '47374') LIMIT 286 ) SELECT ... You might also notice that "st_distance_sphere()" doesn't mean exactly the same thing as the "<->" operator. That's something I could refine going forward. That's what I've got so far. How could I improve this further? For reference, here are the key parts of the "zipcodes" table: # \d zipcodes Table "public.zipcodes" Column | Type | Modifiers --------------+-----------------------+----------- zipcode | character varying(5) | not null lonlat_point | geometry(Point,4326) | Indexes: "zipcodes_pkey" PRIMARY KEY, btree (zipcode) "lonlat_point_idx" gist (lonlat_point) Thanks for the peer review! Mark Stosberg -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance