Re: Query much slower when run from postgres function

Mario Splivalo <mario.splivalo@xxxxxxxxxx> · Mon, 09 Mar 2009 20:13:01 +0100

Tom Lane wrote:
> Mario Splivalo <mario.splivalo@xxxxxxxxxx> writes:
>> Now I'm confused, why is 'sql' function much slower than 'direct' SELECT?
> 
> Usually the reason for this is that the planner chooses a different plan
> when it has knowledge of the particular value you are searching for than
> when it does not.  I suppose 'service_id' has a very skewed distribution
> and you are looking for an uncommon value?

I don't think so. Here is distribution for the messages_memo_displayed
table (joined with messages, just to show how many messages of each
service_id are there in messages_memo_displayed):

pulitzer2=# select service_id, count(*) from messages join
messages_memo_displayed on id = message_id group by service_id order by
service_id;
 service_id | count
------------+-------
        504 |     2
       1790 |  1922
       1814 |     1
       1816 |    57
       1818 |     3
(5 rows)

And the sizes of other tables involved:

pulitzer2=# select count(*) from messages_memo_displayed;
 count
-------
  1985
(1 row)

Time: 0.602 ms
pulitzer2=#

pulitzer2=# select count(*) from messages;
  count
---------
 1096388
(1 row)

Time: 345.267 ms
pulitzer2=# select count(*) from messages_memo;
 count
--------
 776238
(1 row)

Time: 133.942 ms
pulitzer2=#

As I've mentioned earlier, I have created an view, for the sake of this
posting:

CREATE OR REPLACE VIEW _v1 AS
 SELECT messages.id, messages."from", messages."to",
messages.receiving_time, messages.raw_text, messages.keyword,
messages.destination_id, messages.vpn_id, messages.service_id,
messages.status, messages.gateway_message_id, messages.prize_id,
messages.tan, messages_memo.memo, messages_memo.state,
messages_memo.displayed, messages_memo_displayed.admin_id
   FROM messages
   JOIN messages_memo ON messages.id = messages_memo.message_id
   LEFT JOIN messages_memo_displayed ON messages.id =
messages_memo_displayed.message_id
  WHERE messages_memo.state::integer = 1 AND
messages_memo_displayed.admin_id IS NULL;

And then I created a function:

CREATE OR REPLACE FUNCTION
__new__get_memo_display_queue_size(a_service_id integer)
  RETURNS integer AS
$BODY$
SELECT
	COUNT(*)::int4
FROM
	_v1
WHERE
	service_id = $1
$BODY$
  LANGUAGE 'sql' VOLATILE SECURITY DEFINER;

Now, here are the differences:
pulitzer2=# select count(*) from _v1 where service_id = 504;
 count
-------
     0
(1 row)

Time: 6.101 ms
pulitzer2=# select __new__get_memo_display_queue_size(504);
 __new__get_memo_display_queue_size
------------------------------------
                                  0
(1 row)

Time: 322.555 ms
pulitzer2=# select count(*) from _v1 where service_id = 1790;
 count
-------
     1
(1 row)

Time: 25.203 ms
pulitzer2=# select __new__get_memo_display_queue_size(1790);
 __new__get_memo_display_queue_size
------------------------------------
                                  1
(1 row)

Time: 225.763 ms
pulitzer2=# select count(*) from _v1 where service_id = 1814;
 count
-------
     2
(1 row)

Time: 13.662 ms
pulitzer2=# select __new__get_memo_display_queue_size(1814);
 __new__get_memo_display_queue_size
------------------------------------
                                  2
(1 row)

Time: 215.251 ms
pulitzer2=# select count(*) from _v1 where service_id = 1816;
 count
-------
     1
(1 row)

Time: 10.111 ms
pulitzer2=# select __new__get_memo_display_queue_size(1816);
 __new__get_memo_display_queue_size
------------------------------------
                                  1
(1 row)

Time: 220.457 ms
pulitzer2=# select count(*) from _v1 where service_id = 1829;
 count
-------
    13
(1 row)

Time: 2.023 ms
pulitzer2=# select __new__get_memo_display_queue_size(1829);
 __new__get_memo_display_queue_size
------------------------------------
                                 13
(1 row)

Time: 221.956 ms
pulitzer2=#

Is this difference normal? I tend to have the interface between the
database and the application trough functions, and I'd like not to
include 'SELECT COUNT(*)...' in my Java code (at least, if I don't have
to! - esp. because I'm not Java developer on the project).

Then, this is also interesting, I think! I'm telling the planer never to
use sequential scan:

pulitzer2=# set enable_seqscan to false;
SET
Time: 0.150 ms
pulitzer2=# select __new__get_memo_display_queue_size(1829);
 __new__get_memo_display_queue_size
------------------------------------
                                 13
(1 row)

Time: 2.412 ms
pulitzer2=# select count(*) from _v1 where service_id = 1829;
 count
-------
    13
(1 row)

Time: 2.092 ms
pulitzer2=# select __new__get_memo_display_queue_size(1816);
 __new__get_memo_display_queue_size
------------------------------------
                                  1
(1 row)

Time: 2.473 ms
pulitzer2=# select count(*) from _v1 where service_id = 1816;
 count
-------
     1
(1 row)

Time: 2.117 ms
pulitzer2=#

Now the the execution times are almost the same.

So, why this difference in the first place, and, what should I do to
have satisfying results when calling a postgres function?
I could rewrite the function from plain sql to plpgsql, and add 'SET
enable_seqscan TO false' before getting the count, and add 'SET
enable_seqscan TO true' after getting the count, but as I was explained
on pg-jdbc mailinglist that is not the way to go.

And I still don't understand why do I have excellent times when I force
planner not to use sequential scan inside the function, but when
'calling' the query from plain sql (SELECT COUNT(*) FROM _v1 WHERE),
execution time is always around 2-4ms, regardles of the value of
enable_seqscan parametar.

	Mike

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance