Because I know the 25 closest are going to be fairly close in each
coordinate, I tried a multicolumn index on the last 6 columns and
used a +/- 0.1 or 0.2 tolerance on each. (Given the distribution of
the data in question, the 25 best are very probably inside that
hypercube.) This hypercube tended to contain 10-20K records and took
at least 4 seconds to retrieve. I was a little surprised by how long
that took, so I'm wondering if my data representation is off the wall.
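
For concreteness, here is a minimal Python sketch of the filter-then-rank
idea described above. It is not the SQL I actually ran: the function
names, the in-memory list of rows, and the parameters are all hypothetical
and only illustrate the logic. Note that the result is exact only if the
true k nearest rows really do fall inside the box.

```python
import heapq
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nearest_in_box(rows, query, box_dims, tol, k):
    """Return the k rows closest to `query`, considering only rows that
    lie within +/- tol of `query` on every coordinate in `box_dims`.

    rows     -- iterable of vectors (hypothetical stand-in for the table)
    box_dims -- indices of the coordinates used for the hypercube filter
                (in my case this would be the last 6 columns)
    """
    # Step 1: cheap hypercube filter (the part the multicolumn index serves).
    candidates = [
        r for r in rows
        if all(abs(r[d] - query[d]) <= tol for d in box_dims)
    ]
    # Step 2: exact ranking of the survivors over all coordinates.
    return heapq.nsmallest(k, candidates, key=lambda r: euclidean(r, query))
```

With 10-20K candidates surviving step 1, step 2 still has to compute a
full 114-element distance for each one, so the size of the box matters a
lot.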
I should mention I also tried a cube index using gist on all 114
elements, but CREATE INDEX hadn't finished after 36 hours, when I
killed it. In retrospect, I wasn't sure that an index taking something
like 6GB by itself would be helpful on a box with 2GB of RAM.
MK> I don't think that will work for the vector norm i.e:
MK> |x - y| = sqrt(sum over j ((x[j] - y[j])^2))
MK> Cheers
MK> Mark
--
Sincerely,
Andrew Lazarus mailto:andrew@xxxxxxxxxxxx