Timothy Garnett <tgarnett@xxxxxxxxxxx> writes:
> -- Problematic Query
> dev=> explain analyze SELECT "exp_detls".id FROM "exp_detls" WHERE
> ("exp_detls"."hts_code_id" IN (12,654)) LIMIT 1;
>  Limit  (cost=0.00..158.18 rows=1 width=4) (actual time=9661.363..9661.363
> rows=0 loops=1)
>    ->  Seq Scan on exp_detls  (cost=0.00..1336181.90 rows=8447 width=4)
> (actual time=9661.360..9661.360 rows=0 loops=1)
>          Filter: (hts_code_id = ANY ('{12,654}'::integer[]))
>  Total runtime: 9661.398 ms
> (4 rows)

> -- Using OR =, much faster, though a more complicated plan than below
> dev=> explain analyze SELECT "exp_detls".id FROM "exp_detls" WHERE
> ("exp_detls"."hts_code_id" = 12 OR "exp_detls"."hts_code_id" = 654) LIMIT 1;
>  Limit  (cost=162.59..166.29 rows=1 width=4) (actual time=0.029..0.029
> rows=0 loops=1)
>    ->  Bitmap Heap Scan on exp_detls  (cost=162.59..31188.14 rows=8370
> width=4) (actual time=0.028..0.028 rows=0 loops=1)
>          Recheck Cond: ((hts_code_id = 12) OR (hts_code_id = 654))
>          ->  BitmapOr  (cost=162.59..162.59 rows=8370 width=0) (actual
> time=0.027..0.027 rows=0 loops=1)
>                ->  Bitmap Index Scan on
> index_exp_detls_on_hts_code_id_and_data_month  (cost=0.00..79.20 rows=4185
> width=0) (actual time=0.017..0.017 rows=0 loops=1)
>                      Index Cond: (hts_code_id = 12)
>                ->  Bitmap Index Scan on
> index_exp_detls_on_hts_code_id_and_data_month  (cost=0.00..79.20 rows=4185
> width=0) (actual time=0.007..0.007 rows=0 loops=1)
>                      Index Cond: (hts_code_id = 654)
>  Total runtime: 0.051 ms
> (9 rows)

Well, the reason it likes the first plan is that that has a smaller
estimated cost ;-).  Basically this is a startup-time-versus-total-time
issue: the cost of the seqscan+limit is estimated to be about 1/8447'th
of the time to read the whole table, since it's estimating 8447
candidate matches and assuming that those are uniformly distributed in
the table.
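The arithmetic behind that estimate can be checked directly against the cost figures shown in the plan above (a sketch, not anything the planner exposes; the numbers are taken from the quoted EXPLAIN output):

```sql
-- The seqscan's estimated total cost, divided by the estimated number
-- of matching rows, gives the cost attributed to fetching one row,
-- which is what LIMIT 1 is charged:
SELECT 1336181.90 / 8447;  -- roughly 158.18, matching the Limit node's cost
```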
Meanwhile, the bitmap scan has a significant startup cost because the
entire indexscan is completed before we start to do any fetching from
the heap.  The overestimate of the number of matching rows contributes
directly to overestimating the cost of the indexscan, too.  It ends up
being a near thing --- 158 vs 166 cost units --- but on the basis of
these estimates the planner did the right thing.

So, what you need to do to make this better is to get it to have a
better idea of how many rows match the query condition; the
overestimate is both making the expensive plan look cheap, and making
the cheaper plan look expensive.  Cranking up the statistics target for
the hts_code_id column (and re-ANALYZEing) ought to fix it.  If all
your tables are this large you might want to just increase
default_statistics_target across the board.

			regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
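A concrete sketch of the suggested fix, using the table and column names from the plans above (the target value 1000 is an arbitrary illustration, not a recommendation; anything above the default will gather a larger sample):

```sql
-- Raise the per-column statistics target, then re-gather statistics:
ALTER TABLE exp_detls ALTER COLUMN hts_code_id SET STATISTICS 1000;
ANALYZE exp_detls;

-- Or raise it for all columns that lack a per-column setting,
-- e.g. in postgresql.conf or per-session:
-- SET default_statistics_target = 1000;
```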