Re: Query taking long time

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

Thankyou for your answer.
I have made more changes than a simple re-indexing recently. I have moved the sorting field to the table in order to avoid the join clause. Now the schema is very simple. The query only implies one table:

x=> \d+ entity_compounddict2document;
                      Table "public.entity_compounddict2document"
      Column      |              Type              | Modifiers | Storage  | Description 
------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    | 
 document_id      | integer                        |           | plain    | 
 name             | character varying(255)         |           | extended | 
 qualifier        | character varying(255)         |           | extended | 
 tagMethod        | character varying(255)         |           | extended | 
 created          | timestamp(0) without time zone |           | plain    | 
 updated          | timestamp(0) without time zone |           | plain    | 
 curation         | integer                        |           | plain    | 
 hepval           | double precision               |           | plain    | 
 cardval          | double precision               |           | plain    | 
 nephval          | double precision               |           | plain    | 
 phosval          | double precision               |           | plain    | 
 patternCount     | double precision               |           | plain    | 
 ruleScore        | double precision               |           | plain    | 
 hepTermNormScore | double precision               |           | plain    | 
 hepTermVarScore  | double precision               |           | plain    | 
Indexes:
    "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
    "entity_compound2document_cardval" btree (cardval)
    "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
    "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
    "entity_compound2document_hepval" btree (hepval)
    "entity_compound2document_name" btree (name)
    "entity_compound2document_nephval" btree (nephval)
    "entity_compound2document_patterncount" btree ("patternCount")
    "entity_compound2document_phosval" btree (phosval)
    "entity_compound2document_rulescore" btree ("ruleScore")
Has OIDs: no

           tablename            |                   indexname                    |  num_rows   | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
 entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
 entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

The table has aprox. 54,000,000 rows
There are no NULLs in hepval field and pg_settings haven't changed. I also have done "analyze" to this table.

I have simplified the query and added the last advise that you told me:

Query: 

 explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
         Recheck Cond: ((name)::text = 'ranitidine'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
               Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 32717.548 ms


Another query:
explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 25622kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
         Recheck Cond: ((name)::text = 'progesterone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
               Index Cond: ((name)::text = 'progesterone'::text)
 Total runtime: 9296.815 ms


It has improved (I supose because of the lack of the join table) but still taking a lot of time... Anything I can do??

Any help would be very appreciated. Thank you very much

Andrés.


El Mar 3, 2014, a las 1:04 AM, Venkata Balaji Nagothi escribió:

Any Re-Indexing was done recently ?

If the SELECT query without ORDER BY is showing low cost, then, the query can be re-written as below to see if the performance improves. If the resultant rows of the query are 

select * from (select query without order by clause) a order by a.hepval and so on -- something like that.

This should lower the cost of the query because the planner chooses to sort on the resultant set of the rows rather than sorting the table and getting the results.

Please let us know if this helps !

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia


On Fri, Feb 28, 2014 at 8:55 PM, acanada <acanada@xxxxxxx> wrote:
Thankyou for your answer!

Sizes of Tables and Indexes are:

            relname             | rows_in_bytes |  num_rows   | number_of_indexes | unique | single_column | multi_column 
--------------------------------+---------------+-------------+-------------------+--------+---------------+--------------
 documentold                    | 119 MB        | 1.24516e+08 |                12 | Y      |            12 |            0
 entity2document                | 89 MB         | 9.33666e+07 |                 5 | Y      |             4 |            1

           tablename            |                   indexname                    |  num_rows   | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
--------------------------------+------------------------------------------------+-------------+------------+------------+--------+-----------------+-------------+----------------
 documentold                    | DocumentOLD_pkey                               | 1.24516e+08 | 33 GB      | 2708 MB    | Y      |        45812802 |   924462741 |      924084958
 documentold                    | document_cardval_index                         | 1.24516e+08 | 33 GB      | 2763 MB    | Y      |               0 |           0 |              0
 documentold                    | document_heptermnorm_index                     | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_heptermvar_index                      | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_hepval_index                          | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_kind_index                            | 1.24516e+08 | 33 GB      | 2859 MB    | Y      |               0 |           0 |              0
 documentold                    | document_nephval_index                         | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_patterncount_index                    | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_phosval_index                         | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_rulescore_index                       | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_sentenceid_index                      | 1.24516e+08 | 33 GB      | 3867 MB    | Y      |         8089466 |    12669585 |        7597923
 documentold                    | document_uid_index                             | 1.24516e+08 | 33 GB      | 3889 MB    | Y      |               0 |           0 |              0
 entity2document                | Entity2Document_pkey                           | 9.33666e+07 | 7216 MB    | 2000 MB    | Y      |            2942 |        2942 |           2942
 entity2document                | document_qualifier_name_index                  | 9.33666e+07 | 7216 MB    | 3557 MB    | Y      |              93 |     1091680 |         124525
 entity2document                | entity2Document_name_index                     | 9.33666e+07 | 7216 MB    | 2550 MB    | Y      |            4330 |     3320634 |              2
 entity2document                | idx_a6020c0dc33f7837                           | 9.33666e+07 | 7216 MB    | 2000 MB    | Y      |         2465927 |     1661666 |        1661666
 entity2document                | qualifier_index                                | 9.33666e+07 | 7216 MB    | 2469 MB    | Y      |              51 |  2333120186 |              0

The explain plan shows lower cost without order by!!
There are no NULLs in the hepval field...


Thank you for your time!

Andrés



El Feb 28, 2014, a las 2:28 AM, Venkata Balaji Nagothi escribió:

Hi Andres,

Can you please help us with the below information.

- Sizes of Tables and Indexes
- The Explain plan shows same/higher cost without ORDER BY clause ?

I suspect huge number of NULLs might be the problem. If you can please get us the above information, then we can probably know if the cost is genuine.

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia



On Thu, Feb 27, 2014 at 9:31 PM, acanada <acanada@xxxxxxx> wrote:
Hello,

I have changed the multicolumn index from:
        "entity2document_name_qualifier_index" btree (name, qualifier)
 to:
         "document_qualifier_name_index" btree (qualifier, name)

And now the planner doesn't "Recheck cond:"  (there are only three different qualifiers vs. millions of names...)

But still taking long time


Limit  (cost=384043.64..384043.66 rows=10 width=313) (actual time=80555.930..80555.934 rows=10 loops=1)
   ->  Sort  (cost=384043.64..384081.19 rows=15020 width=313) (actual time=80555.928..80555.931 rows=10 loops=1)
         Sort Key: d1_.hepval
         Sort Method:  top-N heapsort  Memory: 29kB
         ->  Nested Loop  (cost=0.00..383719.06 rows=15020 width=313) (actual time=223.778..80547.196 rows=3170 loops=1)
               ->  Index Scan using document_qualifier_name_index on entity2document e0_  (cost=0.00..52505.40 rows=15020 width=59) (actual time=126.880..11549.392 rows=3170 loops=1)
                     Index Cond: (((qualifier)::text = 'CompoundDict'::text) AND ((name)::text = 'galactosamine'::text))
               ->  Index Scan using "DocumentOLD_pkey" on documentold d1_  (cost=0.00..22.04 rows=1 width=254) (actual time=21.747..21.764 rows=1 loops=3170)
                     Index Cond: (d1_.id = e0_.document_id)
                     Filter: (d1_.hepval IS NOT NULL)
 Total runtime: 80556.027 ms



Any help/point to any direction, would be very appreciated.
Thank you,
Andrés

El Feb 26, 2014, a las 4:41 PM, acanada escribió:

> Hello to everybody and thanks in advance to take a look to this message.
> I'm new in this list and with PostgreSQL.
> My queries are taking too much time to complete and I don't know what to do right now. I think I'm providing all  the info required for you to help me. If you need extra info please tell me.
>
> I am using DQL included in the last version of symfony2 (2.4.2). This is the query, formed by DQL, but coppied-pasted to the psql client (9.1.11, server 8.3.8)
>
> explain analyze SELECT e0_.id AS id0, e0_.name AS name1, e0_.qualifier AS qualifier2, e0_."tagMethod" AS tagmethod3, e0_.curation AS curation4, e0_.created AS created5, e0_.updated AS updated6, d1_.id AS id7, d1_.kind AS kind8, d1_.uid AS uid9, d1_."sentenceId" AS sentenceid10, d1_.text AS text11, d1_.hepval AS hepval12, d1_.cardval AS cardval13, d1_.nephval AS nephval14, d1_.phosval AS phosval15, d1_."patternCount" AS patterncount16, d1_."ruleScore" AS rulescore17, d1_."hepTermNormScore" AS heptermnormscore18, d1_."hepTermVarScore" AS heptermvarscore19, d1_.created AS created20, d1_.updated AS updated21, e0_.document_id AS document_id22 FROM Entity2Document e0_ INNER JOIN documentold d1_ ON e0_.document_id = d1_.id WHERE e0_.name ='ranitidine' AND e0_.qualifier = 'CompoundDict' AND d1_.hepval IS NOT NULL ORDER BY d1_.hepval DESC limit 10;
>
>
> limtox=> \d+ documentold;
>                               Table "public.documentold"
>      Column      |              Type              | Modifiers | Storage  | Description
> ------------------+--------------------------------+-----------+----------+-------------
> id               | integer                        | not null  | plain    |
> kind             | character varying(255)         | not null  | extended |
> uid              | character varying(255)         | not null  | extended |
> sentenceId       | character varying(255)         | not null  | extended |
> text             | text                           | not null  | extended |
> hepval           | double precision               |           | plain    |
> created          | timestamp(0) without time zone | not null  | plain    |
> updated          | timestamp(0) without time zone |           | plain    |
> cardval          | double precision               |           | plain    |
> nephval          | double precision               |           | plain    |
> phosval          | double precision               |           | plain    |
> patternCount     | double precision               |           | plain    |
> ruleScore        | double precision               |           | plain    |
> hepTermNormScore | double precision               |           | plain    |
> hepTermVarScore  | double precision               |           | plain    |
> Indexes:
>    "DocumentOLD_pkey" PRIMARY KEY, btree (id)
>    "document_cardval_index" btree (cardval)
>    "document_heptermnorm_index" btree ("hepTermNormScore" DESC NULLS LAST)
>    "document_heptermvar_index" btree ("hepTermVarScore" DESC NULLS LAST)
>    "document_hepval_index" btree (hepval DESC NULLS LAST)
>    "document_kind_index" btree (kind)
>    "document_nephval_index" btree (nephval DESC NULLS LAST)
>    "document_patterncount_index" btree ("patternCount" DESC NULLS LAST)
>    "document_phosval_index" btree (phosval DESC NULLS LAST)
>    "document_rulescore_index" btree ("ruleScore" DESC NULLS LAST)
>    "document_sentenceid_index" btree ("sentenceId")
>    "document_uid_index" btree (uid)
> Referenced by:
>    TABLE "hepkeywordtermnorm2document" CONSTRAINT "fk_1c19bcd0c33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
>    TABLE "cytochrome2document" CONSTRAINT "fk_21f7636fc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
>    TABLE "hepkeywordtermvariant2document" CONSTRAINT "fk_a316e36bc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
>    TABLE "entity2document" CONSTRAINT "fk_a6020c0dc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
>    TABLE "specie2document" CONSTRAINT "fk_b6e551c8c33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
> Has OIDs: no
>
>
>
>
> limtox=> \d+ entity2document;                                     Table "public.entity2document"   Column    |              Type              |            Modifiers            | Storage  | Description -------------+--------------------------------+---------------------------------+----------+------------- id          | integer                        | not null                        | plain    |
> document_id | integer                        |                                 | plain    |
> name        | character varying(255)         | not null                        | extended |
> qualifier   | character varying(255)         | not null                        | extended |
> tagMethod   | character varying(255)         | default NULL::character varying | extended |
> created     | timestamp(0) without time zone | not null                        | plain    |
> updated     | timestamp(0) without time zone |                                 | plain    |
> curation    | integer                        |                                 | plain    |
> Indexes:
>    "Entity2Document_pkey" PRIMARY KEY, btree (id)
>    "entity2Document_name_index" btree (name)
>    "entity2document_name_qualifier_index" btree (name, qualifier)
>    "idx_a6020c0dc33f7837" btree (document_id)
>    "qualifier_index" btree (qualifier)
> Foreign-key constraints:
>    "fk_a6020c0dc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
> Has OIDs: no
>
>
>
>
>
>
> Table metadata:
>    documentold: 124.515.592 of rows. It has several columns with a large proportion of NULLs(updated, patternCount, ruleScore, hepTermNormScore, hepTermVarScore)
>    entity2document: 93.785.968 of rows. It has two columns with a large proportion of NULLs (updated, curation)
>
> None of the tables receive updates or deletes regularly
>
>
>                                                                                QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=387929.02..387929.05 rows=10 width=313) (actual time=55980.472..55980.476 rows=10 loops=1)
>   ->  Sort  (cost=387929.02..387966.75 rows=15090 width=313) (actual time=55980.471..55980.473 rows=10 loops=1)
>         Sort Key: d1_.hepval
>         Sort Method:  top-N heapsort  Memory: 28kB
>         ->  Nested Loop  (cost=469.14..387602.93 rows=15090 width=313) (actual time=96.716..55974.004 rows=2774 loops=1)
>               ->  Bitmap Heap Scan on entity2document e0_  (cost=469.14..54851.25 rows=15090 width=59) (actual time=51.299..8452.592 rows=2774 loops=1)
>                     Recheck Cond: (((name)::text = 'Cimetidine'::text) AND ((qualifier)::text = 'CompoundDict'::text))
>                     ->  Bitmap Index Scan on entity2document_name_qualifier_index  (cost=0.00..465.36 rows=15090 width=0) (actual time=36.467..36.467 rows=2774 loops=1)
>                           Index Cond: (((name)::text = 'Cimetidine'::text) AND ((qualifier)::text = 'CompoundDict'::text))
>               ->  Index Scan using "DocumentOLD_pkey" on documentold d1_  (cost=0.00..22.04 rows=1 width=254) (actual time=17.113..17.129 rows=1 loops=2774)
>                     Index Cond: (d1_.id = e0_.document_id)
>                     Filter: (d1_.hepval IS NOT NULL)
> Total runtime: 55980.554 ms
> (13 rows)
>
> version
> -----------------------------------------------------------------------------------------------------
> PostgreSQL 8.3.8 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 10.4.1-3ubuntu3) 10.4.1
>
> This query has been always slow. It's fast only when it's cached. Vacuum and analyze have been done manually very recently
>
>
>
>
> SELECT name, current_setting(name), source
>  FROM pg_settings
>  WHERE source NOT IN ('default', 'override');
>
>             name            |  current_setting   |        source
> ----------------------------+--------------------+----------------------
> client_encoding            | UTF8               | client
> DateStyle                  | ISO, DMY           | configuration file
> default_text_search_config | pg_catalog.spanish | configuration file
> effective_cache_size       | 7500MB             | configuration file
> lc_messages                | es_ES.UTF-8        | configuration file
> lc_monetary                | es_ES.UTF-8        | configuration file
> lc_numeric                 | C                  | configuration file
> lc_time                    | es_ES.UTF-8        | configuration file
> listen_addresses           | *                  | configuration file
> log_line_prefix            | %t                 | configuration file
> log_timezone               | localtime          | command line
> maintenance_work_mem       | 2000MB             | configuration file
> max_connections            | 100                | configuration file
> max_fsm_pages              | 63217760           | configuration file
> max_stack_depth            | 2MB                | environment variable
> port                       | 5432               | configuration file
> shared_buffers             | 1500MB             | configuration file
> ssl                        | on                 | configuration file
> tcp_keepalives_count       | 9                  | configuration file
> tcp_keepalives_idle        | 7200               | configuration file
> tcp_keepalives_interval    | 75                 | configuration file
> TimeZone                   | localtime          | command line
> timezone_abbreviations     | Default            | command line
> work_mem                   | 50MB               | configuration file
>
> Setting the work_mem to 3000MB doesn't change anything...
>
> Everything seems good to me but the Recheck Cond, because of the large ammount of rows, is slowing the query too much. I have read that is not a good point to try to get rid of recheck cond (maybe even not possible, I don't know, I'm new to PostgreSQL). I'd like to know what I am doing wrong and how can I solve it...
>
> Any help please?
>
> Thank you very much,
>
> Andrés
> **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.
> **CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


**NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.
**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.



--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




**NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.

**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.




**NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.

**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux