Re: Apparently useless bitmap scans

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am Mittwoch, 9. Mai 2007 16:29 schrieb Alvaro Herrera:
> Peter Eisentraut wrote:
> > There's another odd thing about this plan from yesterday.
>
> Is this still 8.2.1?  The logic to choose bitmap indexes was rewritten
> just before 8.2.4,

OK, upgrading to 8.2.4 fixes this odd plan choice.  The query does run
a bit faster too, but the cost estimate has actually gone up!

8.2.1:

                                                                                              QUERY PLAN                                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=87142.18..87366.58 rows=11220 width=184) (actual time=7883.541..8120.647 rows=35000 loops=1)
   ->  Sort  (cost=87142.18..87170.23 rows=11220 width=184) (actual time=7883.471..7926.031 rows=35000 loops=1)
         Sort Key: eh_subj.header_body
         ->  Hash Join  (cost=46283.30..86387.42 rows=11220 width=184) (actual time=5140.182..7635.615 rows=35000 loops=1)
               Hash Cond: (eh_subj.email_id = email.email_id)
               ->  Bitmap Heap Scan on email_header eh_subj  (cost=11853.68..50142.87 rows=272434 width=104) (actual time=367.956..1719.736 rows=280989 loops=1)
                     Recheck Cond: ((mime_part_id = 0) AND (header_name = 'subject'::text))
                     ->  BitmapAnd  (cost=11853.68..11853.68 rows=27607 width=0) (actual time=326.507..326.507 rows=0 loops=1)
                           ->  Bitmap Index Scan on idx__email_header__header_body_subject  (cost=0.00..5836.24 rows=272434 width=0) (actual time=178.041..178.041 rows=280989 loops=1)
                           ->  Bitmap Index Scan on idx__email_header__header_name  (cost=0.00..5880.97 rows=281247 width=0) (actual time=114.574..114.574 rows=280989 loops=1)
                                 Index Cond: (header_name = 'subject'::text)
               ->  Hash  (cost=34291.87..34291.87 rows=11020 width=120) (actual time=4772.148..4772.148 rows=35000 loops=1)
                     ->  Hash Join  (cost=24164.59..34291.87 rows=11020 width=120) (actual time=3131.067..4706.997 rows=35000 loops=1)
                           Hash Cond: (mime_part.email_id = email.email_id)
                           ->  Seq Scan on mime_part  (cost=0.00..8355.81 rows=265804 width=12) (actual time=0.038..514.291 rows=267890 loops=1)
                                 Filter: (mime_part_id = 0)
                           ->  Hash  (cost=24025.94..24025.94 rows=11092 width=112) (actual time=3130.982..3130.982 rows=35000 loops=1)
                                 ->  Hash Join  (cost=22244.54..24025.94 rows=11092 width=112) (actual time=996.556..3069.280 rows=35000 loops=1)
                                       Hash Cond: (eh_from.email_id = email.email_id)
                                       ->  Bitmap Heap Scan on email_header eh_from  (cost=15576.58..16041.55 rows=107156 width=104) (actual time=569.762..1932.017 rows=280990 loops=1)
                                             Recheck Cond: ((mime_part_id = 0) AND (header_name = 'from'::text))
                                             ->  BitmapAnd  (cost=15576.58..15576.58 rows=160 width=0) (actual time=532.217..532.217 rows=0 loops=1)
                                                   ->  Bitmap Index Scan on dummy_index  (cost=0.00..3724.22 rows=107156 width=0) (actual time=116.386..116.386 rows=280990 loops=1)
                                                   ->  Bitmap Index Scan on idx__email_header__from_local  (cost=0.00..5779.24 rows=107156 width=0) (actual time=174.883..174.883 rows=280990 loops=1)
                                                   ->  Bitmap Index Scan on dummy2_index  (cost=0.00..5992.25 rows=107156 width=0) (actual time=173.575..173.575 rows=280990 loops=1)
                                       ->  Hash  (cost=6321.79..6321.79 rows=27694 width=8) (actual time=426.739..426.739 rows=35000 loops=1)
                                             ->  Index Scan using idx__email__time on email  (cost=0.00..6321.79 rows=27694 width=8) (actual time=50.000..375.021 rows=35000 loops=1)
                                                   Index Cond: (("time" >= '2007-05-05 17:01:59'::timestamp without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without time zone))
 Total runtime: 8160.442 ms


8.2.4:

                                                                                            QUERY PLAN                                                                                            
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=100086.52..100658.46 rows=28597 width=182) (actual time=6063.766..6281.818 rows=35000 loops=1)
   ->  Sort  (cost=100086.52..100158.01 rows=28597 width=182) (actual time=6063.697..6105.215 rows=35000 loops=1)
         Sort Key: eh_subj.header_body
         ->  Hash Join  (cost=36729.27..97969.83 rows=28597 width=182) (actual time=3690.316..5790.094 rows=35000 loops=1)
               Hash Cond: (eh_subj.email_id = email.email_id)
               ->  Bitmap Heap Scan on email_header eh_subj  (cost=5903.20..63844.68 rows=267832 width=103) (actual time=214.699..1564.804 rows=280989 loops=1)
                     Recheck Cond: ((mime_part_id = 0) AND (header_name = 'subject'::text))
                     ->  Bitmap Index Scan on idx__email_header__header_body_subject  (cost=0.00..5836.24 rows=267832 width=0) (actual time=172.188..172.188 rows=280989 loops=1)
               ->  Hash  (cost=30468.98..30468.98 rows=28567 width=119) (actual time=3475.484..3475.484 rows=35000 loops=1)
                     ->  Hash Join  (cost=13773.73..30468.98 rows=28567 width=119) (actual time=1260.579..3409.443 rows=35000 loops=1)
                           Hash Cond: (eh_from.email_id = email.email_id)
                           ->  Index Scan using dummy_index on email_header eh_from  (cost=0.00..13286.00 rows=277652 width=103) (actual time=0.076..1391.974 rows=280990 loops=1)
                           ->  Hash  (cost=13429.63..13429.63 rows=27528 width=20) (actual time=1260.422..1260.422 rows=35000 loops=1)
                                 ->  Hash Join  (cost=1799.41..13429.63 rows=27528 width=20) (actual time=114.765..1206.500 rows=35000 loops=1)
                                       Hash Cond: (mime_part.email_id = email.email_id)
                                       ->  Seq Scan on mime_part  (cost=0.00..8355.81 rows=266589 width=12) (actual time=0.036..407.539 rows=267890 loops=1)
                                             Filter: (mime_part_id = 0)
                                       ->  Hash  (cost=1454.07..1454.07 rows=27627 width=8) (actual time=114.644..114.644 rows=35000 loops=1)
                                             ->  Index Scan using idx__email__time on email  (cost=0.00..1454.07 rows=27627 width=8) (actual time=0.144..63.017 rows=35000 loops=1)
                                                   Index Cond: (("time" >= '2007-05-05 17:01:59'::timestamp without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without time zone))
 Total runtime: 6320.790 ms
(21 Zeilen)


The only significant change is that the first Bitmap Heap Scan (line 6)
became more expensive.  You will notice that in the old plan, you had a
pretty good correspondence of 10 cost units to 1 millisecond throughout,
whereas in the new plan that does not apply to said Bitmap Heap Scan.
I'm not sure whether that is cause for concern.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux