Search Postgresql Archives

Re: still gin index creation takes forever

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 12 Nov 2008 15:18:05 -0500
Tom Lane <tgl@xxxxxxxxxxxxx> wrote:

> > So, in that case process can insert about 1000 ItemPointers per
> > one data tree lookup, in opposite case it does 1000 lookups in
> > data tree.

> I see.  So this could explain Ivan's issue if his table contains
> large numbers of repeated GIN keys.  Ivan, is that what your data
> looks like?

Well if by GIN keys you mean lexemes it could be. But I wouldn't say
this circumstance is uncommon among users of tsearch. I'd expect
other people had used tsearch2 to search through titles, authors and
publishers of books, so if that was the problem I'd expect the
problem to come up earlier.
Actually tsearch2 is not completely tuned up, since I still have to
"mix" Italian and English configuration to get rid of some more
stop words etc... that may increase the number of repetitions, but I
doubt this only put me in a corner case.

Anyway trying to answer in a more objective way to your question I
ran:

SELECT * FROM ts_stat('SELECT ft1idx FROM catalog_items') order by
nentry desc, ndoc desc limit 20;

It ran over 9h and I still wasn't able to get the answer.
I killed psql client that was running it and postgres continued to
eat 100% CPU for a while till I stopped it.

Considering that running:
SELECT * FROM ts_stat('SELECT ft1idx FROM catalog_items limit 50000')
order by nentry desc, ndoc desc limit 20;
returned in less than 2 minutes and catalog_items has a bit less
than 1M record... there is still something weird.

"springer";10824;10833
"e";7703;8754
"di";6815;7771
"il";5622;6168
"la";4989;5407
"hall";4357;4416
"prentic";4321;4369
"l";3920;4166
"del";3092;3281
"edizioni";2465;2465
"della";2292;2410
"m";2283;2398
"dell";2150;2281
"j";1967;2099
"d";1789;1864
"per";1685;1770
"longman";1671;1746
"le";1656;1736
"press";1687;1687
"de";1472;1564

examining 90K records took a bit more than 6min.

I'll try to move everything on another box and see what happens.

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux