On 06/08/2012 16:34, Tom Lane wrote:
Ok in that scenario we are back to square one. Following your suggestion my resultant query is this (the temporary table is tmp_tbl_messages)Ioannis Anagnostopoulos <ioannis@xxxxxxxxxx> writes:I think this is a pretty good plan and quite quick given the size of the table (88Million rows at present). However in real life the parameter where I search for msg_id is not an array of 3 ids but of 300.000 or more. It is then that the query forgets the plan and goes to sequential scan. Is there any way around?If you've got that many, any(array[....]) is a bad choice. I'd try putting the IDs into a VALUES(...) list, or even a temporary table, and then writing the query as a join. It is a serious mistake to think that a seqscan is evil when you're dealing with joining that many rows, btw. What you should probably be looking for is a hash join plan. regards, tom lane select ship_pos_messages.* from feed_all_y2012m08.ship_pos_messages join tmp_tbl_messages on (ship_pos_messages.msg_id = tmp_tbl_messages.msg_id) where extract('day' from msg_date_rec) = 1 AND date_trunc('day', msg_date_rec) = '2012-08-01'; which gives us the following explain analyse: "Merge Join (cost=1214220.48..3818359.46 rows=173574357 width=128) (actual time=465036.958..479089.731 rows=341190 loops=1)" " Merge Cond: (feed_all_y2012m08.ship_pos_messages.msg_id = tmp_tbl_messages.msg_id)" " -> Sort (cost=1178961.70..1179223.51 rows=104725 width=128) (actual time=464796.971..476579.208 rows=19512873 loops=1)" " Sort Key: feed_all_y2012m08.ship_pos_messages.msg_id" " Sort Method: external merge Disk: 1254048kB" " -> Append (cost=0.00..1170229.60 rows=104725 width=128) (actual time=0.033..438682.971 rows=19512883 loops=1)" " -> Seq Scan on ship_pos_messages (cost=0.00..0.00 rows=1 width=100) (actual time=0.000..0.000 rows=0 loops=1)" " Filter: ((date_part('day'::text, msg_date_rec) = 1::double precision) AND (date_trunc('day'::text, msg_date_rec) = '2012-08-01 00:00:00'::timestamp without time zone))" " -> Seq Scan on ship_a_pos_messages ship_pos_messages (cost=0.00..0.00 rows=1 width=100) (actual time=0.000..0.000 rows=0 loops=1)" " Filter: ((date_part('day'::text, msg_date_rec) = 1::double precision) AND (date_trunc('day'::text, msg_date_rec) = '2012-08-01 00:00:00'::timestamp without time zone))" " -> Index Scan using idx_ship_b_std_pos_messages_date_trunc on ship_b_std_pos_messages ship_pos_messages (cost=0.00..58657.09 rows=5269 width=128) (actual time=0.032..799.171 rows=986344 loops=1)" " Index Cond: (date_trunc('day'::text, msg_date_rec) = '2012-08-01 00:00:00'::timestamp without time zone)" " Filter: (date_part('day'::text, msg_date_rec) = 1::double precision)" " -> Index Scan using idx_ship_b_ext_pos_messages_date_trunc on ship_b_ext_pos_messages ship_pos_messages (cost=0.00..1694.64 rows=141 width=128) (actual time=0.026..20.661 rows=26979 loops=1)" " Index Cond: (date_trunc('day'::text, msg_date_rec) = '2012-08-01 00:00:00'::timestamp without time zone)" " Filter: (date_part('day'::text, msg_date_rec) = 1::double precision)" " -> Index Scan using idx_ship_a_pos_messages_wk0_date_trunc on ship_a_pos_messages_wk0 ship_pos_messages (cost=0.00..1109877.86 rows=99313 width=128) (actual time=0.029..435784.376 rows=18499560 loops=1)" " Index Cond: (date_trunc('day'::text, msg_date_rec) = '2012-08-01 00:00:00'::timestamp without time zone)" " Filter: (date_part('day'::text, msg_date_rec) = 1::double precision)" " -> Sort (cost=35258.79..36087.50 rows=331486 width=8) (actual time=239.908..307.576 rows=349984 loops=1)" " Sort Key: tmp_tbl_messages.msg_id" " Sort Method: quicksort Memory: 28694kB" " -> Seq Scan on tmp_tbl_messages (cost=0.00..4863.86 rows=331486 width=8) (actual time=0.047..55.227 rows=349984 loops=1)" "Total runtime: 479336.869 ms" Which is a Merge join and not a hash. Any ideas how to make it a hash join? Kind Regards Yiannis |