So let me see if I understand, when searching for everyone within radius
"r" of point (a,b), the gist index will be used like this:
* C is the circle centered on (a,b) of radius "r"
1.
Traverse down the tree, starting at the root. Only go down nodes whose
bounding-box has a non-empty intersection with the circle C (how fast is
this verification?)
2. Once the leaves are reached, verify for everyone of them whether they're inside C or not, returning those that are.
If
this really is how it happens, then I ask: What about boxes that
intersect (does that happen)? What if the boxes aren't "nice" (boxes
with a very small desired intersection with the circle and a very large
quantity of unwanted rows).
Also, you've said that with b-tree indexes on two orthogonal
coordinates (two columns) postgresql would need to check the ENTIRE
vertical strip bounded by a-1 and a+1 and the ENTIRE horizontal strip
bounded by b-1 and b+1 (two limitless, "infinite" rectangles)? This
leads me to another question:
- Isn't there an equality check between all the rows returned by
both indexes, and then the actual distance calculations are performed
only for those returned by both indexes? What if I have many more
indexes on my table, how are things done?
And if I may, one last question:
- Is box-bounding the index strategy used for all geometric operations with gist indexes?
Also, to Oleg:
I
had personally tested the use of gist indexes (without limiting the
number of returned rows, however) for more than 15 million rows (with
their coordinates distributed a VERY LARGE area, sadly). The results
were still impressive to me (although I didn't know what to expect,
maximum running times of around 17ms seemed great to me!).
And sorry for sending this message to your personal email (my mistake).
Thanks a lot for all the help, if you can lead me to any docs/articles, I'll gladly read them.