Re: Query taking long time

Venkata Balaji Nagothi <vbnpgc@xxxxxxxxx> · Tue, 4 Mar 2014 10:28:22 +1100

On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@xxxxxxx> wrote:

Hello,
Thankyou for your answer.

I have made more changes than a simple re-indexing recently. I have moved the sorting field to the table in order to avoid the join clause. Now the schema is very simple. The query only implies one table:

x=> \d+ entity_compounddict2document;
                      Table "public.entity_compounddict2document"
      Column      |              Type              | Modifiers | Storage  | Description 

------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    | 
 document_id      | integer                        |           | plain    | 

 name             | character varying(255)         |           | extended | 
 qualifier        | character varying(255)         |           | extended | 
 tagMethod        | character varying(255)         |           | extended | 

 created          | timestamp(0) without time zone |           | plain    | 
 updated          | timestamp(0) without time zone |           | plain    | 
 curation         | integer                        |           | plain    | 

 hepval           | double precision               |           | plain    | 
 cardval          | double precision               |           | plain    | 
 nephval          | double precision               |           | plain    | 

 phosval          | double precision               |           | plain    | 
 patternCount     | double precision               |           | plain    | 
 ruleScore        | double precision               |           | plain    | 

 hepTermNormScore | double precision               |           | plain    | 
 hepTermVarScore  | double precision               |           | plain    | 
Indexes:
    "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)

    "entity_compound2document_cardval" btree (cardval)
    "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
    "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")

    "entity_compound2document_hepval" btree (hepval)
    "entity_compound2document_name" btree (name)
    "entity_compound2document_nephval" btree (nephval)

    "entity_compound2document_patterncount" btree ("patternCount")
    "entity_compound2document_phosval" btree (phosval)
    "entity_compound2document_rulescore" btree ("ruleScore")

Has OIDs: no

           tablename            |                   indexname                                              |  num_rows    | table_size  | index_size | unique | number_of_scans | tuples_read | tuples_fetched 

 entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

 entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

 entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
 entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

 entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

 entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

The table has aprox. 54,000,000 rows
There are no NULLs in hepval field and pg_settings haven't changed. I also have done "analyze" to this table.

I have simplified the query and added the last advise that you told me:

Query: 

	 explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;

                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------

 Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 2301kB

   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
         Recheck Cond: ((name)::text = 'ranitidine'::text)

         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
               Index Cond: ((name)::text = 'ranitidine'::text)

 Total runtime: 32717.548 ms

Another query:

	explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;

  						              QUERY PLAN  
------------------------------------------------------------------------------------------------------------------------------------------------------------

 Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 25622kB

   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
         Recheck Cond: ((name)::text = 'progesterone'::text)

         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
               Index Cond: ((name)::text = 'progesterone'::text)

 Total runtime: 9296.815 ms

It has improved (I supose because of the lack of the join table) but still taking a lot of time... Anything I can do??

Any help would be very appreciated. Thank you very much.

Good to know performance has increased.

"entity_compounddict2document" table goes through high INSERTS ?

Can you help us know if the "helpval" column and "name" column have high duplicate values ? "n_distinct" value from pg_stats table would have that info. 

Below could be a possible workaround -

As mentioned earlier in this email, a composite Index on name and hepval column might help. If the table does not go through lot of INSERTS, then consider performing a CLUSTER on the table using the same INDEX.

Other recommendations -

Please drop off all the Non-primary key Indexes which have 0 scans / hits. This would harm the DB and the DB server whilst maintenance and DML operations.

Regards,
Venkata Balaji N

Fujitsu Australia