Hi Robert,

thank you for your help. I tried to cluster the table on
hyps_wordid_index and the query execution time dropped from 4.43 to
0.19 seconds, which is not that far from Lucene's performance of 0.10
seconds. Thanks a lot!

Miso Fapso

On 6 July 2010 02:25, Robert Haas <robertmhaas@xxxxxxxxx> wrote:
> On Thu, Jul 1, 2010 at 6:34 PM, Michal Fapso <michal.fapso@xxxxxxxxx> wrote:
>> It took about 4.5 seconds. If I rerun it, it takes less than 2
>> milliseconds, but that is because of the cache. I need to optimize
>> the first run.
>>
>> laptop ASUS, CPU dual core T2300 1.66GHz, 1.5G RAM
>>
>> EXPLAIN ANALYZE SELECT h1.docid
>> FROM hyps AS h1
>> WHERE h1.wordid=65658;
>>
>> Bitmap Heap Scan on hyps h1 (cost=10.97..677.09 rows=171 width=4)
>> (actual time=62.106..4416.864 rows=343 loops=1)
>>   Recheck Cond: (wordid = 65658)
>>   -> Bitmap Index Scan on hyps_wordid_index (cost=0.00..10.92
>> rows=171 width=0) (actual time=42.969..42.969 rows=343 loops=1)
>>        Index Cond: (wordid = 65658)
>> Total runtime: 4432.015 ms
>>
>> If I run the same query in the Lucene search engine, it takes 0.105
>> seconds on the same data, which is a huge difference.
>
> So PostgreSQL is reading 343 rows from disk in 4432 ms, or about 13
> ms/row. I'm not an expert on seek times, but that might not really be
> that unreasonable, considering that those rows may be scattered all
> over the table and thus fetching them may be basically random I/O.
> Have you tried clustering hyps on hyps_wordid_index? If you had a
> more sophisticated disk subsystem you could try increasing
> effective_io_concurrency, but that's not going to help with only one
> spindle.
>
> If you run the same query in Lucene and it takes only 0.105 s, then
> Lucene is obviously doing a lot less I/O. I doubt that any amount of
> tuning of your existing schema is going to produce that kind of
> result on PostgreSQL. Using the full-text search stuff, or a GIN
> index of some kind, might get you closer, but it's hard to beat a
> special-purpose engine that implements exactly the right algorithm
> for your use case.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
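
A minimal sketch of the clustering step described above, using the hyps
table and hyps_wordid_index index from the thread (PostgreSQL 8.4+
syntax; older releases spell it "CLUSTER hyps_wordid_index ON hyps"):

-- Physically rewrite hyps in hyps_wordid_index order, so all rows for a
-- given wordid end up on neighbouring heap pages instead of being
-- scattered across the table.
CLUSTER hyps USING hyps_wordid_index;

-- CLUSTER is a one-shot operation (it takes an exclusive lock and
-- rewrites the table); refresh planner statistics afterwards, and
-- re-cluster periodically if the table sees many updates.
ANALYZE hyps;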
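
The effective_io_concurrency setting Robert mentions can be tried per
session; it only helps bitmap heap scans when the storage can service
several requests at once (RAID arrays, SSDs), so the value below is
purely illustrative:

-- Number of concurrent prefetch requests issued during bitmap heap
-- scans; leave it at the default (1) for a single laptop disk.
SET effective_io_concurrency = 4;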
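
And a hedged sketch of the GIN-index idea from the last paragraph,
assuming the hyps(docid, wordid) layout visible in the EXPLAIN output;
the doc_words table and index names are made up for illustration:

-- Collapse hyps into one row per document, with the word ids in an
-- array; a GIN index on the array then works much like an inverted
-- index / posting list, which is essentially what Lucene maintains.
CREATE TABLE doc_words AS
  SELECT docid, array_agg(wordid) AS wordids
  FROM hyps
  GROUP BY docid;

CREATE INDEX doc_words_wordids_gin ON doc_words USING gin (wordids);

-- "Which documents contain word 65658?" becomes an array containment
-- probe against the GIN index:
SELECT docid FROM doc_words WHERE wordids @> ARRAY[65658];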