Query much slower when run from postgres function

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have a function, looking like this:

CREATE OR REPLACE FUNCTION get_memo_display_queue_size(a_service_id integer)
  RETURNS integer AS
$BODY$
SELECT
	COUNT(*)::integer
FROM
	v_messages_memo
	LEFT JOIN messages_memo_displayed
		ON id = message_id
WHERE
	service_id = $1
	AND state = 1
	AND admin_id IS NULL;
$BODY$
  LANGUAGE 'sql' VOLATILE SECURITY DEFINER
  COST 100;

Now, when I run that function from psql, it takes around 200ms to complete:

pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=219.728..219.730 rows=1 loops=1)
 Total runtime: 219.758 ms
(2 rows)

pulitzer2=#

And it takes around 200ms each time I run the function!


When I rewrite the query so I can see queryplan, I get this:

create view _v1 as
SELECT
	*
FROM
	v_messages_memo
	LEFT JOIN messages_memo_displayed
		ON id = message_id
WHERE
	state = 1
	AND admin_id IS NULL;

pulitzer2=# EXPLAIN ANALYZE select count(*) from _v1 WHERE service_id = 1829;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=23506.14..23506.15 rows=1 width=0) (actual time=6.001..6.002 rows=1 loops=1) -> Nested Loop (cost=150.69..23505.59 rows=216 width=0) (actual time=5.744..5.971 rows=13 loops=1) -> Hash Left Join (cost=150.69..11035.16 rows=2104 width=4) (actual time=5.721..5.793 rows=13 loops=1) Hash Cond: (messages.id = messages_memo_displayed.message_id)
               Filter: (messages_memo_displayed.admin_id IS NULL)
-> Bitmap Heap Scan on messages (cost=97.03..10955.11 rows=4209 width=4) (actual time=0.042..0.075 rows=13 loops=1)
                     Recheck Cond: (service_id = 1829)
-> Bitmap Index Scan on messages_uq__service_id__tan (cost=0.00..95.98 rows=4209 width=0) (actual time=0.032..0.032 rows=13 loops=1)
                           Index Cond: (service_id = 1829)
-> Hash (cost=28.85..28.85 rows=1985 width=8) (actual time=5.666..5.666 rows=1985 loops=1) -> Seq Scan on messages_memo_displayed (cost=0.00..28.85 rows=1985 width=8) (actual time=0.009..2.697 rows=1985 loops=1) -> Index Scan using messages_memo_pk on messages_memo (cost=0.00..5.91 rows=1 width=4) (actual time=0.006..0.008 rows=1 loops=13)
               Index Cond: (messages_memo.message_id = messages.id)
               Filter: ((messages_memo.state)::integer = 1)
 Total runtime: 6.079 ms
(15 rows)


So I noticed that postgres is using seq_scan on messages_memo_displayed, although there is a PK (and an index) on message_id in messages_memo_displayed (I'll post DDL of the tables at the end of the post).

So, I tried EXPLAIN ANALYZE after I forced planner not to use sequential scans:

pulitzer2=# EXPLAIN ANALYZE select count(*) from _v1 WHERE service_id = 1829;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=25403.60..25403.61 rows=1 width=0) (actual time=6.546..6.547 rows=1 loops=1) -> Nested Loop (cost=2048.16..25403.06 rows=216 width=0) (actual time=6.287..6.512 rows=13 loops=1) -> Hash Left Join (cost=2048.16..12932.63 rows=2104 width=4) (actual time=6.268..6.340 rows=13 loops=1) Hash Cond: (messages.id = messages_memo_displayed.message_id)
               Filter: (messages_memo_displayed.admin_id IS NULL)
-> Bitmap Heap Scan on messages (cost=97.03..10955.11 rows=4209 width=4) (actual time=0.043..0.078 rows=13 loops=1)
                     Recheck Cond: (service_id = 1829)
-> Bitmap Index Scan on messages_uq__service_id__tan (cost=0.00..95.98 rows=4209 width=0) (actual time=0.032..0.032 rows=13 loops=1)
                           Index Cond: (service_id = 1829)
-> Hash (cost=1926.31..1926.31 rows=1985 width=8) (actual time=6.211..6.211 rows=1985 loops=1) -> Index Scan using messages_memo_displayed_pk on messages_memo_displayed (cost=0.00..1926.31 rows=1985 width=8) (actual time=0.069..3.221 rows=1985 loops=1) -> Index Scan using messages_memo_pk on messages_memo (cost=0.00..5.91 rows=1 width=4) (actual time=0.006..0.008 rows=1 loops=13)
               Index Cond: (messages_memo.message_id = messages.id)
               Filter: ((messages_memo.state)::integer = 1)
 Total runtime: 6.628 ms
(15 rows)


No sequential scan. So I 'changed' my function so that first row says 'SET enable_seqscan TO false'. After that, here are the times for the function call:

mike@som:~$ psql -U postgres pulitzer2
Welcome to psql 8.3.5, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit
pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=300.448..300.450 rows=1 loops=1)
 Total runtime: 300.491 ms
(2 rows)

pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=1.940..1.941 rows=1 loops=1)
 Total runtime: 1.961 ms
(2 rows)

pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=1.946..1.947 rows=1 loops=1)
 Total runtime: 1.973 ms
(2 rows)

pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=1.936..1.937 rows=1 loops=1)
 Total runtime: 1.964 ms
(2 rows)

pulitzer2=#

So, first query on the same connection takes 300ms, and any subsequential query on the same connection takes less than 2 ms. If I remove 'SET enable_seqscan TO false' from the top of the function, every call to the function takes around 200-300ms.


Now, as I was explained on pg-jdbc mailinglist, that 'SET enable_seqscan TO false' affects all queries on that persistent connection from tomcat, and It's not good solution. So I wanted to post here to ask what other options do I have.

While writing this I realized that, without forcing sequential scan out, I get much quicker execution times when I do:

SELECT count(*) FROM _v1 WHERE service_id = 1829

then when I do

SELECT get_memo_display_queue_size(1829),

as seen here:



mike@som:~$ psql -U postgres pulitzer2
Welcome to psql 8.3.5, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=259.090..259.092 rows=1 loops=1)
 Total runtime: 259.132 ms
(2 rows)

pulitzer2=# EXPLAIN ANALYZE select count(*) from _v1 WHERE service_id = 1829;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=23517.98..23517.99 rows=1 width=0) (actual time=5.942..5.943 rows=1 loops=1) -> Nested Loop (cost=150.70..23517.44 rows=216 width=0) (actual time=5.674..5.909 rows=13 loops=1) -> Hash Left Join (cost=150.70..11037.87 rows=2105 width=4) (actual time=5.633..5.706 rows=13 loops=1) Hash Cond: (messages.id = messages_memo_displayed.message_id)
               Filter: (messages_memo_displayed.admin_id IS NULL)
-> Bitmap Heap Scan on messages (cost=97.04..10957.81 rows=4210 width=4) (actual time=0.032..0.063 rows=13 loops=1)
                     Recheck Cond: (service_id = 1829)
-> Bitmap Index Scan on messages_uq__service_id__tan (cost=0.00..95.98 rows=4210 width=0) (actual time=0.022..0.022 rows=13 loops=1)
                           Index Cond: (service_id = 1829)
-> Hash (cost=28.85..28.85 rows=1985 width=8) (actual time=5.588..5.588 rows=1985 loops=1) -> Seq Scan on messages_memo_displayed (cost=0.00..28.85 rows=1985 width=8) (actual time=0.009..2.690 rows=1985 loops=1) -> Index Scan using messages_memo_pk on messages_memo (cost=0.00..5.92 rows=1 width=4) (actual time=0.008..0.010 rows=1 loops=13)
               Index Cond: (messages_memo.message_id = messages.id)
               Filter: ((messages_memo.state)::integer = 1)
 Total runtime: 6.026 ms
(15 rows)

pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=211.712..211.714 rows=1 loops=1)
 Total runtime: 211.742 ms
(2 rows)

pulitzer2=# EXPLAIN ANALYZE select count(*) from _v1 WHERE service_id = 1829;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=23517.98..23517.99 rows=1 width=0) (actual time=5.918..5.920 rows=1 loops=1) -> Nested Loop (cost=150.70..23517.44 rows=216 width=0) (actual time=5.659..5.885 rows=13 loops=1) -> Hash Left Join (cost=150.70..11037.87 rows=2105 width=4) (actual time=5.638..5.711 rows=13 loops=1) Hash Cond: (messages.id = messages_memo_displayed.message_id)
               Filter: (messages_memo_displayed.admin_id IS NULL)
-> Bitmap Heap Scan on messages (cost=97.04..10957.81 rows=4210 width=4) (actual time=0.043..0.078 rows=13 loops=1)
                     Recheck Cond: (service_id = 1829)
-> Bitmap Index Scan on messages_uq__service_id__tan (cost=0.00..95.98 rows=4210 width=0) (actual time=0.033..0.033 rows=13 loops=1)
                           Index Cond: (service_id = 1829)
-> Hash (cost=28.85..28.85 rows=1985 width=8) (actual time=5.581..5.581 rows=1985 loops=1) -> Seq Scan on messages_memo_displayed (cost=0.00..28.85 rows=1985 width=8) (actual time=0.009..2.678 rows=1985 loops=1) -> Index Scan using messages_memo_pk on messages_memo (cost=0.00..5.92 rows=1 width=4) (actual time=0.006..0.008 rows=1 loops=13)
               Index Cond: (messages_memo.message_id = messages.id)
               Filter: ((messages_memo.state)::integer = 1)
 Total runtime: 5.994 ms
(15 rows)

pulitzer2=#



Now I'm confused, why is 'sql' function much slower than 'direct' SELECT?

	Mike

P.S. Here are tables definition, from psql:

pulitzer2=# \d messages
                                               Table "public.messages"
Column | Type | Modifiers
--------------------+--------------------------+---------------------------------------------------------------------
id | integer | not null default nextval(('public.message_id_seq'::text)::regclass)
 from               | character varying(15)    | not null
 to                 | character varying(10)    | not null
 receiving_time     | timestamp with time zone | not null default now()
 raw_text           | character varying        | not null
 keyword            | character varying        |
 destination_id     | integer                  | not null
 vpn_id             | integer                  |
 service_id         | integer                  |
 status             | integer                  | not null default 2
 gateway_message_id | character varying        | not null
 prize_id           | integer                  |
 tan                | character varying        |
Indexes:
    "messages_pk" PRIMARY KEY, btree (id)
    "messages_uq__gateway_message_id" UNIQUE, btree (gateway_message_id)
    "messages_uq__service_id__tan" UNIQUE, btree (service_id, tan)
    "messages_ix_from" btree ("from")
"messages_ix_receiving_time__service_id__status" btree (receiving_time, service_id, status)
    "messages_ix_vpn_id" btree (vpn_id)
Foreign-key constraints:
"messages_fk__destinations_id" FOREIGN KEY (destination_id) REFERENCES destinations(id) "messages_fk__service_prizes_prize_id" FOREIGN KEY (prize_id) REFERENCES service_prizes(prize_id) "messages_fk__services_id" FOREIGN KEY (service_id) REFERENCES services(id)
    "messages_fk__vpns_id" FOREIGN KEY (vpn_id) REFERENCES vpns(id)


pulitzer2=# \d messages_memo
                 Table "public.messages_memo"
         Column         |           Type           | Modifiers
------------------------+--------------------------+-----------
 message_id             | integer                  | not null
 memo                   | character varying        |
 state                  | dom_messages_memo_state  | not null
 admin_id               | integer                  |
 admin_change_timestamp | timestamp with time zone |
Indexes:
    "messages_memo_pk" PRIMARY KEY, btree (message_id)
Foreign-key constraints:
"messages_memo_fk__messages_id" FOREIGN KEY (message_id) REFERENCES messages(id)

pulitzer2=# \d messages_memo_displayed
Table "public.messages_memo_displayed"
   Column   |  Type   | Modifiers
------------+---------+-----------
 message_id | integer | not null
 admin_id   | integer | not null
Indexes:
    "messages_memo_displayed_pk" PRIMARY KEY, btree (message_id, admin_id)
Foreign-key constraints:
"messages_memo_displayed_fk__admins_id" FOREIGN KEY (admin_id) REFERENCES admins(id) "messages_memo_displayed_fk__messages_id" FOREIGN KEY (message_id) REFERENCES messages(id)

pulitzer2=# \d v_messages_memo
               View "public.v_messages_memo"
       Column       |           Type           | Modifiers
--------------------+--------------------------+-----------
 id                 | integer                  |
 from               | character varying(15)    |
 to                 | character varying(10)    |
 receiving_time     | timestamp with time zone |
 raw_text           | character varying        |
 keyword            | character varying        |
 destination_id     | integer                  |
 vpn_id             | integer                  |
 service_id         | integer                  |
 status             | integer                  |
 gateway_message_id | character varying        |
 prize_id           | integer                  |
 tan                | character varying        |
 memo               | character varying        |
 state              | dom_messages_memo_state  |
 displayed          | boolean                  |
View definition:
SELECT v_messages_full.id, v_messages_full."from", v_messages_full."to", v_messages_full.receiving_time, v_messages_full.raw_text, v_messages_full.keyword, v_messages_full.destination_id, v_messages_full.vpn_id, v_messages_full.service_id, v_messages_full.status, v_messages_full.gateway_message_id, v_messages_full.prize_id, v_messages_full.tan, messages_memo.memo, messages_memo.state, NULL::boolean AS displayed
   FROM messages v_messages_full
   JOIN messages_memo ON v_messages_full.id = messages_memo.message_id;

pulitzer2=#



--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux