Preliminaries: pg 8.3.3 on ubuntu 7.10 4x1.66GHz CPUs 10G ram interesting postgresql.conf settings: max_connections = 300 shared_buffers = 3000MB work_mem = 32MB random_page_cost = 1.5 effective_cache_size = 5000MB default_statistics_target = 30 lc_messages = 'en_US.UTF-8' OK, We're running mnogosearch, and we have many many different "sites" on it. So, there's a dict table with all the words in it, and an url table with all the sites in it. The typical query to pull them in looks like this: SELECT dict.url_id,dict.intag FROM dict, url WHERE dict.word='lesson' AND url.rec_id=dict.url_id AND url.site_id IN ('-259409521'); random_page_cost can be anything from 1.5 to 4.0 and I'll get this plan most times: Hash Join (cost=2547.45..11396.26 rows=37 width=8) (actual time=3830.936..16304.325 rows=22 loops=1) Hash Cond: (dict.url_id = url.rec_id) -> Bitmap Heap Scan on dict (cost=121.63..8939.55 rows=6103 width=8) (actual time=1590.001..16288.708 rows=16172 loops=1) Recheck Cond: (word = 'lesson'::text) -> Bitmap Index Scan on dict_word_url_id (cost=0.00..120.11 rows=6103 width=0) (actual time=1587.322..1587.322 rows=16172 loops=1) Index Cond: (word = 'lesson'::text) -> Hash (cost=2402.07..2402.07 rows=1900 width=4) (actual time=5.583..5.583 rows=1996 loops=1) -> Bitmap Heap Scan on url (cost=84.09..2402.07 rows=1900 width=4) (actual time=1.048..4.444 rows=1996 loops=1) Recheck Cond: (site_id = (-259409521)) -> Bitmap Index Scan on url_siteid (cost=0.00..83.61 rows=1900 width=0) (actual time=0.642..0.642 rows=1996 loops=1) Index Cond: (site_id = (-259409521)) Total runtime: 16304.488 ms Second time through, the run time is much faster: Hash Join (cost=2547.45..11396.26 rows=37 width=8) (actual time=15.690..41.572 rows=22 loops=1) Hash Cond: (dict.url_id = url.rec_id) -> Bitmap Heap Scan on dict (cost=121.63..8939.55 rows=6103 width=8) (actual time=7.090..29.589 rows=16172 loops=1) Recheck Cond: (word = 'lesson'::text) -> Bitmap Index Scan on dict_word_url_id (cost=0.00..120.11 rows=6103 width=0) (actual time=4.677..4.677 rows=16172 loops=1) Index Cond: (word = 'lesson'::text) -> Hash (cost=2402.07..2402.07 rows=1900 width=4) (actual time=5.535..5.535 rows=1996 loops=1) -> Bitmap Heap Scan on url (cost=84.09..2402.07 rows=1900 width=4) (actual time=1.037..4.439 rows=1996 loops=1) Recheck Cond: (site_id = (-259409521)) -> Bitmap Index Scan on url_siteid (cost=0.00..83.61 rows=1900 width=0) (actual time=0.635..0.635 rows=1996 loops=1) Index Cond: (site_id = (-259409521)) Total runtime: 41.648 ms Note that big change in the Bitmap Heap Scan near the top. I assume that the first time it's hitting the disk or something? If I do this: set enable_mergejoin=off; set enable_hashjon=off; this forces the planner to switch away from a bitmap heap scan on dict. I now get a plan like this: Nested Loop (cost=84.09..23997.47 rows=37 width=8) (actual time=241.436..530.531 rows=28 loops=1) -> Bitmap Heap Scan on url (cost=84.09..2402.07 rows=1900 width=4) (actual time=0.980..4.557 rows=1996 loops=1) Recheck Cond: (site_id = (-259409521)) -> Bitmap Index Scan on url_siteid (cost=0.00..83.61 rows=1900 width=0) (actual time=0.577..0.577 rows=1996 loops=1) Index Cond: (site_id = (-259409521)) -> Index Scan using dict_word_url_id on dict (cost=0.00..11.35 rows=1 width=8) (actual time=0.247..0.263 rows=0 loops=1996) Index Cond: ((dict.word = 'assembly'::text) AND (dict.url_id = url.rec_id)) Total runtime: 530.607 ms Note that there's a different key work, since the old one would be cached. Second run looks similar, but down to 50 milliseconds since the data are now cached. So, anyone got any hints? I'd prefer not to have to set those two methods off at the db level, and might move them into the search engine at least, but I'd much rather change a setting on the server. For now, the load on the server during the day is down from 8 to 15 to 0.30, so I can live with the two methods turned off for now, but I know that I'll hit something where nested_loop is too slow eventually.