Tom Lane wrote: > Mario Splivalo <mario.splivalo@xxxxxxxxxx> writes: >> Now I'm confused, why is 'sql' function much slower than 'direct' SELECT? > > Usually the reason for this is that the planner chooses a different plan > when it has knowledge of the particular value you are searching for than > when it does not. I suppose 'service_id' has a very skewed distribution > and you are looking for an uncommon value? I don't think so. Here is distribution for the messages_memo_displayed table (joined with messages, just to show how many messages of each service_id are there in messages_memo_displayed): pulitzer2=# select service_id, count(*) from messages join messages_memo_displayed on id = message_id group by service_id order by service_id; service_id | count ------------+------- 504 | 2 1790 | 1922 1814 | 1 1816 | 57 1818 | 3 (5 rows) And the sizes of other tables involved: pulitzer2=# select count(*) from messages_memo_displayed; count ------- 1985 (1 row) Time: 0.602 ms pulitzer2=# pulitzer2=# select count(*) from messages; count --------- 1096388 (1 row) Time: 345.267 ms pulitzer2=# select count(*) from messages_memo; count -------- 776238 (1 row) Time: 133.942 ms pulitzer2=# As I've mentioned earlier, I have created an view, for the sake of this posting: CREATE OR REPLACE VIEW _v1 AS SELECT messages.id, messages."from", messages."to", messages.receiving_time, messages.raw_text, messages.keyword, messages.destination_id, messages.vpn_id, messages.service_id, messages.status, messages.gateway_message_id, messages.prize_id, messages.tan, messages_memo.memo, messages_memo.state, messages_memo.displayed, messages_memo_displayed.admin_id FROM messages JOIN messages_memo ON messages.id = messages_memo.message_id LEFT JOIN messages_memo_displayed ON messages.id = messages_memo_displayed.message_id WHERE messages_memo.state::integer = 1 AND messages_memo_displayed.admin_id IS NULL; And then I created a function: CREATE OR REPLACE FUNCTION __new__get_memo_display_queue_size(a_service_id integer) RETURNS integer AS $BODY$ SELECT COUNT(*)::int4 FROM _v1 WHERE service_id = $1 $BODY$ LANGUAGE 'sql' VOLATILE SECURITY DEFINER; Now, here are the differences: pulitzer2=# select count(*) from _v1 where service_id = 504; count ------- 0 (1 row) Time: 6.101 ms pulitzer2=# select __new__get_memo_display_queue_size(504); __new__get_memo_display_queue_size ------------------------------------ 0 (1 row) Time: 322.555 ms pulitzer2=# select count(*) from _v1 where service_id = 1790; count ------- 1 (1 row) Time: 25.203 ms pulitzer2=# select __new__get_memo_display_queue_size(1790); __new__get_memo_display_queue_size ------------------------------------ 1 (1 row) Time: 225.763 ms pulitzer2=# select count(*) from _v1 where service_id = 1814; count ------- 2 (1 row) Time: 13.662 ms pulitzer2=# select __new__get_memo_display_queue_size(1814); __new__get_memo_display_queue_size ------------------------------------ 2 (1 row) Time: 215.251 ms pulitzer2=# select count(*) from _v1 where service_id = 1816; count ------- 1 (1 row) Time: 10.111 ms pulitzer2=# select __new__get_memo_display_queue_size(1816); __new__get_memo_display_queue_size ------------------------------------ 1 (1 row) Time: 220.457 ms pulitzer2=# select count(*) from _v1 where service_id = 1829; count ------- 13 (1 row) Time: 2.023 ms pulitzer2=# select __new__get_memo_display_queue_size(1829); __new__get_memo_display_queue_size ------------------------------------ 13 (1 row) Time: 221.956 ms pulitzer2=# Is this difference normal? I tend to have the interface between the database and the application trough functions, and I'd like not to include 'SELECT COUNT(*)...' in my Java code (at least, if I don't have to! - esp. because I'm not Java developer on the project). Then, this is also interesting, I think! I'm telling the planer never to use sequential scan: pulitzer2=# set enable_seqscan to false; SET Time: 0.150 ms pulitzer2=# select __new__get_memo_display_queue_size(1829); __new__get_memo_display_queue_size ------------------------------------ 13 (1 row) Time: 2.412 ms pulitzer2=# select count(*) from _v1 where service_id = 1829; count ------- 13 (1 row) Time: 2.092 ms pulitzer2=# select __new__get_memo_display_queue_size(1816); __new__get_memo_display_queue_size ------------------------------------ 1 (1 row) Time: 2.473 ms pulitzer2=# select count(*) from _v1 where service_id = 1816; count ------- 1 (1 row) Time: 2.117 ms pulitzer2=# Now the the execution times are almost the same. So, why this difference in the first place, and, what should I do to have satisfying results when calling a postgres function? I could rewrite the function from plain sql to plpgsql, and add 'SET enable_seqscan TO false' before getting the count, and add 'SET enable_seqscan TO true' after getting the count, but as I was explained on pg-jdbc mailinglist that is not the way to go. And I still don't understand why do I have excellent times when I force planner not to use sequential scan inside the function, but when 'calling' the query from plain sql (SELECT COUNT(*) FROM _v1 WHERE), execution time is always around 2-4ms, regardles of the value of enable_seqscan parametar. Mike -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance