On Thu, Jul 29, 2010 at 10:33 PM, Dean Rasheed <dean.a.rasheed@xxxxxxxxx> wrote: > On 28 July 2010 02:58, Howard Rogers <hjr@xxxxxxxxxx> wrote: >> For what it's worth, I wrote up the performance comparison here: >> http://diznix.com/dizwell/archives/153 >> > > Thanks, very interesting results. I wonder, are the results being > sorted by the database? The performance degradation for large numbers > of results might be explained by it switching over from an internal to > an external sort, in which case tweaking work_mem might make a > difference. > > Of course this is pure speculation without the EXPLAIN ANALYSE output. > > Regards, > Dean Yes, the results were being sorted. I did various tests, changing work_mem, shared_buffers and much else, one by one, until I arrived at the combination of settings that gave me the best 'total search time' results. Personally, I couldn't see any difference in the explain plans, but I was in a bit of a hurry and I may have missed it. For the search term 'woman', which matches 1,590,275 documents, here's the explain plan: "Sort (cost=185372.88..185372.93 rows=20 width=312) (actual time=10537.152..10537.154 rows=20 loops=1)" " Sort Key: a.rf, a.sort_id" " Sort Method: quicksort Memory: 48kB" " -> Result (cost=109119.55..185372.45 rows=20 width=312) (actual time=4309.020..10537.116 rows=20 loops=1)" " -> Append (cost=109119.55..185372.45 rows=20 width=312) (actual time=4309.018..10537.108 rows=20 loops=1)" " -> Subquery Scan a (cost=109119.55..109119.68 rows=10 width=312) (actual time=4309.018..4309.026 rows=10 loops=1)" " -> Limit (cost=109119.55..109119.58 rows=10 width=641) (actual time=4309.016..4309.019 rows=10 loops=1)" " -> Sort (cost=109119.55..109121.94 rows=957 width=641) (actual time=4309.014..4309.015 rows=10 loops=1)" " Sort Key: search_rm.sort_id" " Sort Method: top-N heapsort Memory: 35kB" " -> Bitmap Heap Scan on search_rm (cost=6651.07..109098.87 rows=957 width=641) (actual time=272.851..4021.458 rows=583275 loops=1)" " Recheck Cond: (to_tsvector('english'::regconfig, (textsearch)::text) @@ '''woman'''::tsquery)" " Filter: ((bitand(sales_method_code, 1) > 0) AND (bitand(subsiter, 1) > 0) AND (bitand(filetype, 1) > 0))" " -> Bitmap Index Scan on rmsearch_idx (cost=0.00..6650.83 rows=25826 width=0) (actual time=165.711..165.711 rows=586235 loops=1)" " Index Cond: (to_tsvector('english'::regconfig, (textsearch)::text) @@ '''woman'''::tsquery)" " -> Subquery Scan b (cost=76252.65..76252.77 rows=10 width=312) (actual time=6228.073..6228.080 rows=10 loops=1)" " -> Limit (cost=76252.65..76252.67 rows=10 width=727) (actual time=6228.072..6228.075 rows=10 loops=1)" " -> Sort (cost=76252.65..76254.29 rows=655 width=727) (actual time=6228.071..6228.072 rows=10 loops=1)" " Sort Key: search_rf.sort_id" " Sort Method: top-N heapsort Memory: 38kB" " -> Bitmap Heap Scan on search_rf (cost=5175.18..76238.49 rows=655 width=727) (actual time=363.684..5748.279 rows=1007000 loops=1)" " Recheck Cond: (to_tsvector('english'::regconfig, (textsearch)::text) @@ '''woman'''::tsquery)" " Filter: ((bitand(sales_method_code, 1) > 0) AND (bitand(subsiter, 1) > 0) AND (bitand(filetype, 1) > 0))" " -> Bitmap Index Scan on rfsearch_idx (cost=0.00..5175.02 rows=17694 width=0) (actual time=242.859..242.859 rows=1030282 loops=1)" " Index Cond: (to_tsvector('english'::regconfig, (textsearch)::text) @@ '''woman'''::tsquery)" "Total runtime: 10538.832 ms" And here's the plan for the search term "clover", which matches only 2,808 records in total: " -> Result (cost=109119.55..185372.45 rows=20 width=312) (actual time=16.807..23.990 rows=20 loops=1)" " -> Append (cost=109119.55..185372.45 rows=20 width=312) (actual time=16.806..23.985 rows=20 loops=1)" " -> Subquery Scan a (cost=109119.55..109119.68 rows=10 width=312) (actual time=16.806..16.812 rows=10 loops=1)" " -> Limit (cost=109119.55..109119.58 rows=10 width=641) (actual time=16.805..16.807 rows=10 loops=1)" " -> Sort (cost=109119.55..109121.94 rows=957 width=641) (actual time=16.804..16.805 rows=10 loops=1)" " Sort Key: search_rm.sort_id" " Sort Method: top-N heapsort Memory: 35kB" " -> Bitmap Heap Scan on search_rm (cost=6651.07..109098.87 rows=957 width=641) (actual time=1.054..15.577 rows=1807 loops=1)" " Recheck Cond: (to_tsvector('english'::regconfig, (textsearch)::text) @@ '''clover'''::tsquery)" " Filter: ((bitand(sales_method_code, 1) > 0) AND (bitand(subsiter, 1) > 0) AND (bitand(filetype, 1) > 0))" " -> Bitmap Index Scan on rmsearch_idx (cost=0.00..6650.83 rows=25826 width=0) (actual time=0.615..0.615 rows=1807 loops=1)" " Index Cond: (to_tsvector('english'::regconfig, (textsearch)::text) @@ '''clover'''::tsquery)" " -> Subquery Scan b (cost=76252.65..76252.77 rows=10 width=312) (actual time=7.161..7.166 rows=10 loops=1)" " -> Limit (cost=76252.65..76252.67 rows=10 width=727) (actual time=7.161..7.163 rows=10 loops=1)" " -> Sort (cost=76252.65..76254.29 rows=655 width=727) (actual time=7.160..7.161 rows=10 loops=1)" " Sort Key: search_rf.sort_id" " Sort Method: top-N heapsort Memory: 35kB" " -> Bitmap Heap Scan on search_rf (cost=5175.18..76238.49 rows=655 width=727) (actual time=0.433..6.642 rows=1001 loops=1)" " Recheck Cond: (to_tsvector('english'::regconfig, (textsearch)::text) @@ '''clover'''::tsquery)" " Filter: ((bitand(sales_method_code, 1) > 0) AND (bitand(subsiter, 1) > 0) AND (bitand(filetype, 1) > 0))" " -> Bitmap Index Scan on rfsearch_idx (cost=0.00..5175.02 rows=17694 width=0) (actual time=0.250..0.250 rows=1045 loops=1)" " Index Cond: (to_tsvector('english'::regconfig, (textsearch)::text) @@ '''clover'''::tsquery)" "Total runtime: 24.143 ms" I can't see any change to the sorting behaviour there. Work_mem was set to 4096MB, shared buffers to 12228MB, temp_buffers to 1024MB, effective_cache_size to 18442MB. Sadly, I won't be able to provide much further analysis or information, because the box concerned is being wiped. The MD decided that, as a matter of corporate governance, he couldn't punt the company on PostgreSQL, so my experimenting days are over. Back to Oracle: slower, but with a support contract he can sue on, I guess! Regards HJR -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general