Merlin Moncure wrote: > On 2/5/07, Mark Stosberg <mark@xxxxxxxxxxxxxxx> wrote: >> Bruno Wolff III wrote: >> > On Sat, Feb 03, 2007 at 14:00:26 -0500, >> > Mark Stosberg <mark@xxxxxxxxxxxxxxx> wrote: >> >> I'm using geo_distance() from contrib/earthdistance would like to >> find a >> >> way to spend up the geo distance calculation if possible. This is >> for a >> >> proximity search: "Show me adoptable pets within 250 miles of this >> >> zipcode". >> > >> > If you are using the "cube" based part of the earth distance package, >> > then you can use gist indexes to speed those searches up. >> >> Thanks for the tip. Any idea what kind of improvement I can expect to >> see, compared to using geo_distance()? > > a lot. be aware that gist takes longer to build than btree, but very > fast to search. Index search and filter to box is basically an index > lookup (fast!). for mostly static datasets that involve a lot of > searching, gist is ideal. The documentation in contrib/ didn't provide examples of how to create or the index or actually a the proximity search. Here's what I figured out to do: I added a new column as type 'cube': ALTER table zipcodes add column earth_coords cube; Next I converted the old lat/lon data I had stored in a 'point' type to the new format: -- Make to get lat/lon in the right order for your data model! UPDATE zipcodes set earth_coords = ll_to_earth( lon_lat[1], lon_lat[0] ); Now I added a GIST index on the field: CREATE index earth_coords_idx on zipcodes using gist (earth_coords); Finally, I was able to run a query, which I could see used the index (by checking "EXPLAIN ANALYZE ..." select * from zipcodes where earth_box('(436198.322855334, 4878562.8732218, 4085386.43843934)'::cube,16093.44) @ earth_coords; It's also notable that the units used are meters, not miles like geo_distance(). That's what the magic number of "16093.44" is-- 10 miles converted to meters. When I benchmarked this query against the old geo_distance() variation, it was about 200 times faster (~100ms vs .5ms). However, my next step was to try a more "real world" query that involved a more complex where clause and a couple of table joins. So far, that result is coming out /slower/ with the new approach, even though the index is being used. I believe this may be cause of the additional results found that are outside of the sphere, but inside the cube. This causes additional rows that need processing in the joined tables. Could someone post an example of how to further refine this so the results more closely match what geo_distance returns() ? Any other indexing or optimization tips would be appreciated. Mark