Search Postgresql Archives

Slow query plan used

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have a question concerning the query planner. I observe that chosen query plan differs on length and content of a like search expression.
We have a view combining data from two tables, both containing same number of rows (round about 30000). Used PostgreSQL version is 9.3.15 on Windows.
DDL of tables and view see below.

Query is:
select ID, OWNERID, FOLDER, RCVSERVICE, STATUS, TASKSTATUS, CREATED
    from DOCUMENTDATA 
    where  FOLDER in ('*INBOX','*DELAYED')  and 
           ARCHIVED='0' and 
           ( lower(subject) like lower('%Sehr%')  or  lower(mailContent) like lower('%Sehr%') ) and 
           UMR_ACTUALTOPICID in ('f3fb345741000000','8048405641000000','4fc81b5541000000','d27d9c4d41000200','e9aba54d41000000','4aaf905441000a00','737e9c4d41000900',
                                 '4aaf905441000800','3ecdec4d41000000','4aaf905441000e00','fc7e9c4d41000f00','4aaf905441000c00','11ffc54f41000000','4aaf905441000200')
                                 and 
           OWNERID in ('5000239','5000238','5000234','5000113','5000237','5000236')
    order by CREATED desc, ID desc limit 150 offset 0

Only 130 rows out of the 30000 have ARCHIVED = 0 
Chosen query plan is following:

Limit  (cost=22738.95..22739.00 rows=20 width=664) (actual time=13929.849..13929.869 rows=98 loops=1)
  Buffers: shared hit=221263 read=45723
  ->  Sort  (cost=22738.95..22739.00 rows=20 width=664) (actual time=13929.848..13929.863 rows=98 loops=1)
        Sort Key: c3k_document.created, c3k_document.id
        Sort Method: quicksort  Memory: 87kB
        Buffers: shared hit=221263 read=45723
        ->  Nested Loop  (cost=0.42..22738.52 rows=20 width=664) (actual time=95.508..13929.478 rows=98 loops=1)
              Buffers: shared hit=221263 read=45723
              ->  Seq Scan on c3k_document_index  (cost=0.00..15160.48 rows=1063 width=285) (actual time=0.206..13539.353 rows=33022 loops=1)
                    Filter: ((lower((searchfield1)::text) ~~ '%sehr%'::text) OR (lower(searchfield8) ~~ '%sehr%'::text))
                    Rows Removed by Filter: 33832
                    Buffers: shared hit=101130 read=33463
              ->  Index Scan using c3k_docume_6720023941 on c3k_document  (cost=0.42..7.12 rows=1 width=387) (actual time=0.011..0.011 rows=0 loops=33022)
                    Index Cond: (id = c3k_document_index.documentid)
                    Filter: (((folder)::text = ANY ('{*INBOX,*DELAYED}'::text[])) AND (archived = '0'::bpchar) AND ((umr_actualtopicid)::text = ANY ('{f3fb345741000000,8048405641000000,4fc81b5541000000,d27d9c4d41000200,e9aba54d41000000,4aaf905441000a00,737 (...)
                    Rows Removed by Filter: 1
                    Buffers: shared hit=120133 read=12260
Total runtime: 13930.186 ms

If the like expression is lower('%%Sehr%%')  for fields subject and mailContent, a much faster query plan is chosen:

Limit  (cost=24018.18..24018.42 rows=97 width=664) (actual time=61.110..61.130 rows=98 loops=1)
  Buffers: shared hit=1961 read=598
  ->  Sort  (cost=24018.18..24018.42 rows=97 width=664) (actual time=61.109..61.122 rows=98 loops=1)
        Sort Key: c3k_document.created, c3k_document.id
        Sort Method: quicksort  Memory: 87kB
        Buffers: shared hit=1961 read=598
        ->  Nested Loop  (cost=515.26..24014.98 rows=97 width=664) (actual time=5.193..60.851 rows=98 loops=1)
              Buffers: shared hit=1961 read=598
              ->  Bitmap Heap Scan on c3k_document  (cost=514.96..16137.34 rows=1232 width=387) (actual time=2.137..10.754 rows=282 loops=1)
                    Recheck Cond: (((folder)::text = ANY ('{*INBOX,*DELAYED}'::text[])) AND (archived = '0'::bpchar))
                    Filter: (((umr_actualtopicid)::text = ANY ('{f3fb345741000000,8048405641000000,4fc81b5541000000,d27d9c4d41000200,e9aba54d41000000,4aaf905441000a00,737e9c4d41000900,4aaf905441000800,3ecdec4d41000000,4aaf905441000e00,fc7e9c4d41000f00,11ff (...)
                    Rows Removed by Filter: 23
                    Buffers: shared hit=828 read=356
                    ->  Bitmap Index Scan on c3k_document_folder_archived_umr_orgtopicid_idx  (cost=0.00..514.66 rows=6183 width=0) (actual time=1.946..1.946 rows=2847 loops=1)
                          Index Cond: (((folder)::text = ANY ('{*INBOX,*DELAYED}'::text[])) AND (archived = '0'::bpchar))
                          Buffers: shared hit=7 read=113
              ->  Index Scan using c3k_docume_7965268402 on c3k_document_index  (cost=0.29..6.38 rows=1 width=285) (actual time=0.176..0.176 rows=0 loops=282)
                    Index Cond: (documentid = c3k_document.id)
                    Filter: ((lower((searchfield1)::text) ~~ '%%sehr%%'::text) OR (lower(searchfield8) ~~ '%%sehr%%'::text))
                    Rows Removed by Filter: 1
                    Buffers: shared hit=1133 read=242
Total runtime: 61.385 ms

But when the like expression is only a few characters longer the first - slower - query plan is used again. I thought as there's such a strong restriction by field ARCHIVED always an index containing that field would be used. VACUUM ANALYZE and REINDEX had no influence on the query plan.
Can you please give me a hint why the behavior is that way and if there's a possibility to force the use of the faster query plan.

Thanks,
Juergen

CREATE OR REPLACE VIEW c3k_documentdata AS 
 SELECT document.id,
    document.ownerid,
    document.folder,
    document.doccomment,
    document.rcvservice,
    document.status,
    document.taskstatus,
    document.created,
    document.archived,
    document.frominfo,
    document.ticketid,
    document.actualtopicid,
    document.orgtopicid,
    document_index.searchfield1 AS subject,
    document_index.searchfield2 AS fromaddress,
    document_index.searchfield3 AS toaddress,
    document_index.searchfield4 AS ccaddress,
    document_index.searchfield5 AS bccaddress,
    document_index.searchfield6 AS replytoaddress,
    document_index.searchfield7 AS hasattachment,
    document_index.searchfield8 AS mailcontent,
    document_index.searchfield9 AS mustnotautoreply,
    document_index.searchfield10 AS returnpath,
    document_index.searchfield11 AS stateweight,
    document_index.searchfield12 AS fromaddressextracted,
    document_index.searchfield13 AS sensitivity,
    document_index.searchfield14 AS priority,
    document.docfile
   FROM document
     LEFT JOIN document_index ON document.id = document_index.documentid;

CREATE TABLE document
(
  id bigserial NOT NULL,
  ownerid character(49),
  folder character varying(49),
  created timestamp without time zone NOT NULL,
  touched timestamp without time zone NOT NULL,
  status character varying(31),
  frominfo character varying(255),
  rcvservice character varying(15),
  doccomment character varying(255),
  docfile bytea,
  taskstatus character varying(31) DEFAULT 'INACTIVE'::character varying,
  archived character(1) NOT NULL DEFAULT '0'::bpchar,
  ticketid character varying(25) NOT NULL DEFAULT ''::character varying,
  orgtopicid character varying(25) NOT NULL DEFAULT ''::character varying,
  actualtopicid character varying(25) NOT NULL DEFAULT ''::character varying,
  CONSTRAINT c3k_docume_6720023941 PRIMARY KEY (id)
);

CREATE INDEX c3k_document_archived_ticketid_idx
  ON c3k_document
  USING btree
  (archived COLLATE pg_catalog."default", ticketid COLLATE pg_catalog."default");

CREATE INDEX c3k_document_folder_archived_orgtopicid_idx
  ON c3k_document
  USING btree
  (folder COLLATE pg_catalog."default", archived COLLATE pg_catalog."default", orgtopicid COLLATE pg_catalog."default");

CREATE INDEX c3k_document_ownerid_folder_created_idx
  ON c3k_document
  USING btree
  (ownerid COLLATE pg_catalog."default", folder COLLATE pg_catalog."default", created);

CREATE INDEX c3k_document_ownerid_folder_status_idx
  ON c3k_document
  USING btree
  (ownerid COLLATE pg_catalog."default", folder COLLATE pg_catalog."default", status COLLATE pg_catalog."default");


  
CREATE TABLE document_index
(
  documentid bigint NOT NULL,
  searchfield1 character varying(255),
  searchfield2 character varying(255),
  searchfield3 character varying(255),
  searchfield4 character varying(255),
  searchfield5 character varying(255),
  searchfield6 character varying(255),
  searchfield7 character varying(255),
  searchfield8 text,
  searchfield9 character varying(255),
  searchfield10 character varying(255),
  searchfield11 character varying(255),
  searchfield12 character varying(255),
  searchfield13 character varying(255) DEFAULT '0'::character varying,
  searchfield14 character varying(255) DEFAULT '1'::character varying,
  searchfield15 character varying(255),
  searchfield16 text,
  CONSTRAINT docume_7965268402 PRIMARY KEY (documentid),
  CONSTRAINT docindex_docid_fk FOREIGN KEY (documentid)
      REFERENCES document (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE CASCADE
);



-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux