On Thu, 27 Apr 2006, Vivek Khera wrote:
On Apr 26, 2006, at 3:17 AM, Teodor Sigaev wrote:
We know of tsearch2 installations working with 4 million docs.
What are the design goals for the size of the source tables? My engineers
tell me of things their friends have tried that hit limits of tsearch2.
One was importing a large message board (millions of rows, a few
sentences of text per row) and ran into problems (which were not detailed).
Our interest is in using it for indexing mailing lists we host. We're
looking at about 100 messages per day right now, with potential for growth.
Short of actually implementing it and loading up sample data, what
guidelines can you provide as to the limits of tsearch2 source data size?
I can imagine having 10+ million rows of 4k-byte to 10k-byte long messages
within a couple of years.
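For concreteness, the kind of setup I have in mind is a plain messages table
with a precomputed tsvector column, roughly like the sketch below (table and
column names are just placeholders; to_tsvector() and the GiST index come
from contrib/tsearch2):

    -- Hypothetical mailing-list archive table.
    CREATE TABLE messages (
        id      serial PRIMARY KEY,
        posted  timestamptz NOT NULL DEFAULT now(),
        subject text,
        body    text,
        fti     tsvector        -- precomputed full-text search vector
    );

    -- Fill the tsvector column using tsearch2's to_tsvector().
    UPDATE messages
       SET fti = to_tsvector('default',
                             coalesce(subject, '') || ' ' || coalesce(body, ''));

    -- Index the tsvector column (GiST in contrib/tsearch2).
    CREATE INDEX messages_fti_idx ON messages USING gist (fti);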
It should be no problem with the inverted index we just posted. Search itself
is very fast! The problem is intrinsic to a relational database: reading
data from disk. If you find 100,000 results and you want to rank them,
you have to read them from disk, which is slow. That's why we use a caching
search daemon; on a blog collection of 5 mln documents we could get
1 mln searches/day on an 8 GB RAM server.
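To illustrate (a sketch only, reusing the hypothetical messages table above
with tsearch2's @@ operator and rank() function): the index locates the
matching rows quickly, but computing the rank means fetching each matching
row's tsvector from disk.

    -- Ranked search: the index makes the WHERE clause fast, but rank()
    -- still has to read every matching row, which is the disk-bound part
    -- when a query matches ~100,000 rows.
    SELECT id, subject, rank(fti, q) AS score
      FROM messages, to_tsquery('default', 'postgres & fulltext') AS q
     WHERE fti @@ q
     ORDER BY score DESC
     LIMIT 10;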
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83