On Sat, 21 Jan 2006, Martijn van Oosterhout wrote:
However, IMHO, this algorithm is optimising the wrong thing. It
shouldn't be trying to split into sets that are far apart, it should be
trying to split into sets that minimize the number of set bits (i.e.
distance from zero), since that's what will speed up searching.
Martijn, you're right! We want not only to split a page into very
different parts, but also to avoid increasing the number of set bits
in the resulting signatures, which are the union (bitwise OR) of all
signatures in each part. We need not only fast index creation
(thanks, Tom!), but a better index. Some information is available here:
http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_internals
There should be a more detailed document, but I don't remember where :)
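
To make the criterion concrete, here is a minimal Python sketch. It is
not the actual tsearch2 split code; the 8-bit toy signatures and the
names union_signature, popcount and split_cost are made up for
illustration. It only shows how the number of set bits in the OR'ed
signatures differs between two possible splits of the same page.

```python
from functools import reduce

def union_signature(signatures):
    """OR together all signatures that end up in one half of a split."""
    return reduce(lambda a, b: a | b, signatures, 0)

def popcount(sig):
    """Number of set bits in a signature (its distance from zero)."""
    return bin(sig).count("1")

def split_cost(left, right):
    """Total set bits in the two union signatures produced by a split."""
    return popcount(union_signature(left)) + popcount(union_signature(right))

# Hypothetical 8-bit signatures; real signatures are much wider.
sigs = [0b00000011, 0b00000110, 0b11000000, 0b01100000]

# Split A keeps overlapping signatures together.
split_a = (sigs[:2], sigs[2:])
# Split B mixes them, so each union picks up more bits.
split_b = ([sigs[0], sigs[2]], [sigs[1], sigs[3]])

print(split_cost(*split_a))  # 6
print(split_cost(*split_b))  # 8
```

Split A would be preferred under this measure, since its union
signatures carry fewer set bits.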
That's harder though (this algorithm does approximate it, sort of)
and I haven't come up with an algorithm yet.
Don't ask how hard we thought about it :)
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83