8.4 optimization regression?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I am in the progress of an 8.3 to 8.4 upgrade for a customer. I seem to have stumbled upon what looks like a regression. The two databases (8.3.14 and 8.4.8) have identical tuning parameters (where that makes sense) and run on identical hardware. Both databases are regularly vacuumed and analyzed (not by autovacuum), and performing an ANALYZE does not change the plans shown below.

The query below is a slightly simplified version of what a report uses. I note that 8.3 chooses a better plan (approx twice as fast) and 8.4 is massively *overestimating* the rows from the top hash join (8.3 massively underestimates them mind you). This row overestimating becomes more of an issue when the remaining subqueries etc are added into the query below - to the point where the 8.4 runtime goes into days unless. Now I have some ways around that (convert NOT IN to NOT EXISTS), but there seems to be nothing I can do to speed this base query, which unfortunately is a common construction in the application. Any ideas?

Query and plans:

         EXPLAIN ANALYZE
         SELECT 1
         FROM correspondence c
         JOIN audit_log gen_al on (gen_al.audit_id = generated_audit_id)
         JOIN correspondence_master cm using(corresp_master_id)
         JOIN person p on(p.person_id = cm.person_id)

WHERE c.corresp_type_id IN ('CL11', 'CL11A', 'CL12', 'CL15', 'CL15A', 'CL16', 'DM_1', 'DM_2')
             AND cm.person_id IS NOT NULL
AND gen_al.audit_timestamp > ('2011-08-19 13:05'::timestamp - '6 months'::interval)
             AND p.active = true
             AND p.exclude_walklist_alt = false
             AND p.postal_address_id is null
             AND p.unpublished = false
             AND p.enrolment_status_id in ('E', 'T')
             AND p.person_type in ('M', 'D', 'O')

QUERY PLAN 8.3
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1710864.04..1835979.46 rows=33873 width=0) (actual time=34520.372..39762.042 rows=220336 loops=1)
   Hash Cond: (p.person_id = cm.person_id)
-> Seq Scan on person p (cost=0.00..115154.11 rows=2566020 width=4) (actual time=0.066..3347.560 rows=3101074 loops=1) Filter: (active AND (NOT exclude_walklist_alt) AND (postal_address_id IS NULL) AND (NOT unpublished) AND (enrolment_status_id = ANY ('{E,T}'::text[])) AND (person_type = ANY ('{M,D,O}'::text[]))) -> Hash (cost=1703761.54..1703761.54 rows=568200 width=4) (actual time=34519.041..34519.041 rows=251911 loops=1) -> Hash Join (cost=793298.59..1703761.54 rows=568200 width=4) (actual time=8383.900..34414.612 rows=251911 loops=1)
               Hash Cond: (cm.corresp_master_id = c.corresp_master_id)
-> Seq Scan on correspondence_master cm (cost=0.00..778644.91 rows=33636281 width=12) (actual time=0.045..9651.799 rows=33966957 loops=1)
                     Filter: (person_id IS NOT NULL)
-> Hash (cost=785006.92..785006.92 rows=663333 width=8) (actual time=8260.951..8260.951 rows=358582 loops=1) -> Hash Join (cost=233226.31..785006.92 rows=663333 width=8) (actual time=7042.396..8140.430 rows=358582 loops=1) Hash Cond: (c.generated_audit_id = gen_al.audit_id) -> Bitmap Heap Scan on correspondence c (cost=74599.32..527309.84 rows=4103876 width=16) (actual time=869.067..2474.081 rows=5297729 loops=1) Recheck Cond: (corresp_type_id = ANY ('{CL11,CL11A,CL12,CL15,CL15A,CL16,DM_1,DM_2}'::text[])) -> Bitmap Index Scan on corresp_type_fk (cost=0.00..73573.35 rows=4103876 width=0) (actual time=834.201..834.201 rows=5297729 loops=1) Index Cond: (corresp_type_id = ANY ('{CL11,CL11A,CL12,CL15,CL15A,CL16,DM_1,DM_2}'::text[])) -> Hash (cost=111084.39..111084.39 rows=2897808 width=8) (actual time=3373.619..3373.619 rows=2854079 loops=1) -> Index Scan using audit_log_audit_timestamp on audit_log gen_al (cost=0.00..111084.39 rows=2897808 width=8) (actual time=0.164..2322.894 rows=2854079 loops=1) Index Cond: (audit_timestamp > '2011-02-19 13:05:00'::timestamp without time zone)
 Total runtime: 39787.258 ms



QUERY PLAN 8.4
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=914274.32..2577396.93 rows=3757429 width=0) (actual time=11425.505..62959.205 rows=220336 loops=1)
   Hash Cond: (c.generated_audit_id = gen_al.audit_id)
-> Hash Join (cost=761247.58..2313627.39 rows=3757429 width=8) (actual time=8335.459..58598.883 rows=3087353 loops=1)
         Hash Cond: (cm.person_id = p.person_id)
-> Hash Join (cost=604036.44..2031002.71 rows=4202314 width=12) (actual time=3850.257..49067.879 rows=3586540 loops=1)
               Hash Cond: (cm.corresp_master_id = c.corresp_master_id)
-> Seq Scan on correspondence_master cm (cost=0.00..777996.60 rows=33964871 width=12) (actual time=0.024..10206.688 rows=33966957 loops=1)
                     Filter: (person_id IS NOT NULL)
-> Hash (cost=530987.51..530987.51 rows=4202314 width=16) (actual time=3848.577..3848.577 rows=5297729 loops=1) -> Bitmap Heap Scan on correspondence c (cost=76346.23..530987.51 rows=4202314 width=16) (actual time=673.660..2272.497 rows=5297729 loops=1) Recheck Cond: (corresp_type_id = ANY ('{CL11,CL11A,CL12,CL15,CL15A,CL16,DM_1,DM_2}'::text[])) -> Bitmap Index Scan on corresp_type_fk (cost=0.00..75295.65 rows=4202314 width=0) (actual time=640.301..640.301 rows=5297729 loops=1) Index Cond: (corresp_type_id = ANY ('{CL11,CL11A,CL12,CL15,CL15A,CL16,DM_1,DM_2}'::text[])) -> Hash (cost=115091.03..115091.03 rows=2567289 width=4) (actual time=4484.737..4484.737 rows=3101076 loops=1) -> Seq Scan on person p (cost=0.00..115091.03 rows=2567289 width=4) (actual time=0.013..3406.661 rows=3101076 loops=1) Filter: (active AND (NOT exclude_walklist_alt) AND (postal_address_id IS NULL) AND (NOT unpublished) AND (enrolment_status_id = ANY ('{E,T}'::text[])) AND (person_type = ANY ('{M,D,O}'::text[]))) -> Hash (cost=107100.99..107100.99 rows=2799260 width=8) (actual time=2962.039..2962.039 rows=2854070 loops=1) -> Index Scan using audit_log_audit_timestamp on audit_log gen_al (cost=0.00..107100.99 rows=2799260 width=8) (actual time=0.181..2055.802 rows=2854070 loops=1) Index Cond: (audit_timestamp > '2011-02-19 13:05:00'::timestamp without time zone)
 Total runtime: 63007.358 ms


Cheers

Mark

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux