very slow left join

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I've inherited an Oracle database that I'm porting to Postgres, and this has been going quite well until now. Unfortunately, I've found one view (a largish left join) that runs several orders of magnitude slower on Postgres than it did on Oracle.

=> select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)
(1 row)


After analyzing the database, the explain analyze output for the query is:

 Nested Loop Left Join  (cost=133.51..15846.99 rows=1 width=312) (actual time=109.131..550711.374 rows=1248 loops=1)
   Join Filter: (log.logkey = ln.logkey)
   ->  Nested Loop  (cost=133.51..267.44 rows=1 width=306) (actual time=15.316..74.074 rows=1248 loops=1)
         ->  Merge Join  (cost=133.51..267.16 rows=1 width=325) (actual time=15.300..60.332 rows=1248 loops=1)
               Merge Cond: (log.eventkey = e.eventkey)
               Join Filter: ((e.clientkey = log.clientkey) AND (e.premiseskey = log.premiseskey))
               ->  Index Scan using log_eventkey_idx on log  (cost=0.00..3732.14 rows=36547 width=167) (actual time=0.015..25.385 rows=36547 loops=1)
                     Filter: (logicaldel = 'N'::bpchar)
               ->  Sort  (cost=133.51..135.00 rows=595 width=328) (actual time=15.185..16.379 rows=1248 loops=1)
                     Sort Key: e.eventkey
                     ->  Hash Join  (cost=1.30..106.09 rows=595 width=328) (actual time=0.073..2.033 rows=1248 loops=1)
                           Hash Cond: ((e.clientkey = p.clientkey) AND (e.premiseskey = p.premiseskey))
                           ->  Seq Scan on event e  (cost=0.00..89.48 rows=1248 width=246) (actual time=0.005..0.481 rows=1248 loops=1)
                           ->  Hash  (cost=1.14..1.14 rows=11 width=82) (actual time=0.059..0.059 rows=11 loops=1)
                                 ->  Seq Scan on premises p  (cost=0.00..1.14 rows=11 width=82) (actual time=0.004..0.020 rows=11 loops=1)
                                       Filter: (logicaldel = 'N'::bpchar)
         ->  Index Scan using severity_pk on severity s  (cost=0.00..0.27 rows=1 width=49) (actual time=0.007..0.009 rows=1 loops=1248)
               Index Cond: (e.severitykey = s.severitykey)
   ->  Seq Scan on lognote ln1  (cost=0.00..15552.67 rows=1195 width=175) (actual time=1.173..440.695 rows=1244 loops=1248)
         Filter: ((logicaldel = 'N'::bpchar) AND (subplan))
         SubPlan
           ->  Limit  (cost=4.30..8.58 rows=1 width=34) (actual time=0.171..0.171 rows=1 loops=2982720)
                 InitPlan
                   ->  GroupAggregate  (cost=0.00..4.30 rows=1 width=110) (actual time=0.089..0.089 rows=1 loops=2982720)
                         ->  Index Scan using lognote_pk on lognote (cost=0.00..4.28 rows=1 width=110) (actual time=0.086..0.087 rows=1 loops=2982720)
                               Index Cond: ((clientkey = $0) AND (premiseskey = $1) AND (logkey = $2))
                               Filter: ((logicaldel = 'N'::bpchar) AND ((lognotetext ~~ '_%;%'::text) OR (lognotetext ~~ '_%has modified Respond Status to%'::text)))
                 ->  Index Scan using lognote_pk on lognote  (cost=0.00..4.28 rows=1 width=34) (actual time=0.170..0.170 rows=1 loops=2982720)
                       Index Cond: ((clientkey = $0) AND (premiseskey = $1) AND (logkey = $2))
                       Filter: ((logicaldel = 'N'::bpchar) AND (lognotetime = $3))
 Total runtime: 550712.393 ms
(31 rows)


Either side of the left join runs quite fast independently. (The full query also runs well when made into an inner join, but that's not the logic I want.) The biggest difference between running each side indpendently and together in a left join is that this line in the plan for the right side of the left join:

->  Index Scan using lognote_pk on lognote (cost=0.00..4.28 rows=1 width=110) (actual time=0.086..0.087 rows=1 loops=2982720)

...becomes this line when run independantly:

->  Index Scan using lognote_pk on lognote  (cost=0.00..4.28 rows=1 width=110) (actual time=0.086..0.087 rows=1 loops=2390)

That's quite a few more loops in the left join. Am I right to think that it's looping so much because the analyzer is so far off when guessing the rows for the left side of the join (1 vs. 1248)? Or is there something else going on? I've tried bumping up analyze stats on a few columns, but I'm not too sure how to spot which columns it might help with and, sure enough, it didn't help.


The actual query:

select *
from
  (
    select *
    from
      event e,
      severity s,
      premises p,
      log
    where
      p.clientkey = e.clientkey and
      p.premiseskey = e.premiseskey and
      p.logicaldel = 'N' and
      log.logicaldel = 'N' and
      e.clientkey = log.clientkey and
      e.premiseskey = log.premiseskey and
      e.eventkey = log.eventkey and
      e.severitykey = s.severitykey
  ) lj
  left join
  (
    select
      clientkey, premiseskey, logkey, lognotetime, logicaldel,
    case
      when
        (case when instr(lognotetext,';') = 0 then instr(lognotetext,' has modified')
              else instr(lognotetext,';') end) = 0 then NULL
       else
           substr(lognotetext,1,
           (
               case when instr(lognotetext,';') = 0 then
                   instr(lognotetext,' has modified') else
                   instr(lognotetext,';') end
           ) - 1)
         end as responderid
       from lognote ln1
       where
         logicaldel = 'N' and
         lognotekey in
         (
           select lognotekey
           from lognote
           where
             logicaldel = 'N' and
             clientkey = ln1.clientkey and
             premiseskey = ln1.premiseskey and
             logkey = ln1.logkey and
             lognotetime =
               (
                 select min(lognotetime)
                 from lognote
                 where
                    logicaldel = 'N' and
                    (
                      lognotetext like '_%;%' or
                      lognotetext like '_%has modified Respond Status to%'
                    ) and
                    clientkey = ln1.clientkey and
                    premiseskey = ln1.premiseskey and
                    logkey = ln1.logkey
                  group by clientkey, premiseskey, logkey
                )
             order by lognotekey limit 1
            )
  ) ln on
  (
    lj.logkey = ln.logkey
  )


The instr() function calls are calling this version of instr:
http://www.postgresql.org/docs/8.2/interactive/plpgsql-porting.html#PLPGSQL-PORTING-APPENDIX



The relevent schema:

                         Table "public.event"
     Column     |            Type             |       Modifiers
----------------+-----------------------------+------------------------
 clientkey      | character(30)               | not null
 premiseskey    | character(30)               | not null
 eventkey       | character(30)               | not null
 severitykey    | character(30)               |
Indexes:
    "event_pk" PRIMARY KEY, btree (clientkey, premiseskey, eventkey), tablespace "data"
Foreign-key constraints:
    "premisesevent" FOREIGN KEY (clientkey, premiseskey) REFERENCES premises(clientkey, premiseskey) DEFERRABLE INITIALLY DEFERRED

            Table "public.severity"
       Column       |     Type      | Modifiers
--------------------+---------------+-----------
 severitykey        | character(30) | not null
 severityname       | text          |
Indexes:
    "severity_pk" PRIMARY KEY, btree (severitykey), tablespace "data"

                       Table "public.premises"
     Column      |            Type             |      Modifiers
-----------------+-----------------------------+---------------------
 clientkey       | character(30)               | not null
 premiseskey     | character(30)               | not null
 logicaldel      | character(1)                | default 'N'::bpchar
Indexes:
    "premises_pk" PRIMARY KEY, btree (clientkey, premiseskey), tablespace "data"
Foreign-key constraints:
    "clientpremises" FOREIGN KEY (clientkey) REFERENCES client(clientkey) DEFERRABLE INITIALLY DEFERRED

                         Table "public.log"
     Column     |            Type             |      Modifiers
----------------+-----------------------------+---------------------
 clientkey      | character(30)               | not null
 premiseskey    | character(30)               | not null
 logkey         | character(30)               | not null
 logicaldel     | character(1)                | default 'N'::bpchar
 eventkey       | character(30)               |
Indexes:
    "log_pk" PRIMARY KEY, btree (clientkey, premiseskey, logkey), tablespace "data"
    "log_ak1" btree (clientkey, premiseskey, logtime, logkey), tablespace "data"
    "log_eventkey_idx" btree (eventkey), tablespace "data"
Foreign-key constraints:
    "premiseslog" FOREIGN KEY (clientkey, premiseskey) REFERENCES premises(clientkey, premiseskey) DEFERRABLE INITIALLY DEFERRED

                     Table "public.lognote"
   Column    |            Type             |      Modifiers
-------------+-----------------------------+---------------------
 clientkey   | character(30)               | not null
 premiseskey | character(30)               | not null
 logkey      | character(30)               | not null
 lognotekey  | character(30)               | not null
 logicaldel  | character(1)                | default 'N'::bpchar
 lognotetext | text                        |
 lognotetime | timestamp without time zone |
Indexes:
    "lognote_pk" PRIMARY KEY, btree (clientkey, premiseskey, logkey, lognotekey), tablespace "data"
    "lognotekey_idx" UNIQUE, btree (lognotekey), tablespace "data"
Foreign-key constraints:
    "log_lognote_fk1" FOREIGN KEY (clientkey, premiseskey, logkey) REFERENCES log(clientkey, premiseskey, logkey) DEFERRABLE INITIALLY DEFERRED



Any help would be appreciated!


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux