Re: gist indexes for distance calculations

Marcelo Zabani <mzabani@xxxxxxxxx> · Thu, 30 Sep 2010 23:50:14 -0300

So let me see if I understand, when searching for everyone within radius
 "r" of point (a,b), the gist index will be used like this:
* C is the circle centered on (a,b) of radius "r"

1.
 Traverse down the tree, starting at the root. Only go down nodes whose 
bounding-box has a non-empty intersection with the circle C (how fast is
 this verification?)

2. Once the leaves are reached, verify for everyone of them whether they're inside C or not, returning those that are.

If
 this really is how it happens, then I ask: What about boxes that 
intersect (does that happen)? What if the boxes aren't "nice" (boxes 
with a very small desired intersection with the circle and a very large 
quantity of unwanted rows).

Also, you've said that with b-tree indexes on two orthogonal 
coordinates (two columns) postgresql would need to check the ENTIRE 
vertical strip bounded by a-1 and a+1 and the ENTIRE horizontal strip 
bounded by b-1 and b+1 (two limitless, "infinite" rectangles)? This 
leads me to another question:

- Isn't there an equality check between all the rows returned by 
both indexes, and then the actual distance calculations are performed 
only for those returned by both indexes? What if I have many more 
indexes on my table, how are things done?

And if I may, one last question:
- Is box-bounding the index strategy used for all geometric operations with gist indexes?

Also, to Oleg:
I
 had personally tested the use of gist indexes (without limiting the 
number of returned rows, however) for more than 15 million rows (with 
their coordinates distributed a VERY LARGE area, sadly). The results 
were still impressive to me (although I didn't know what to expect, 
maximum running times of around 17ms seemed great to me!).
And sorry for sending this message to your personal email (my mistake).

Thanks a lot for all the help, if you can lead me to any docs/articles, I'll gladly read them.