Search Postgresql Archives

Re: dubious optimization of the function in SELECT INTO target list

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/06/2015 01:13 PM, Oleksii Kliukin wrote:
Hello,

I have an issue with a function that is called as a part of the SELECT
INTO target list in pl/pgSQL. I'd like to illustrate it with a simple
example:

DO $$
DECLARE l_id integer;
     BEGIN
        SELECT test(id) INTO l_id
         FROM generate_series(1,10) t(id);
     END;
$$ LANGUAGE plpgsql;

It looks like the test function in this example is executed only once.
In order to check this, one can define the test function as following:

CREATE TABLE foo(id integer);

CREATE OR REPLACE FUNCTION public.test(id integer)
RETURNS integer
LANGUAGE plpgsql
AS
$fn$
     BEGIN
         INSERT INTO foo VALUES($1);
         RETURN $1;
     END;
$fn$

Basically, if we invoke the first example, the foo table with have only
1 row and not 10, as supplied by the generate_series.
However, when ORDER BY is attached to the query, or aggregate (such as
max, min or array_agg) is wrapped around the test(id) call, the test
function is called exactly 10 times. If I replace the SELECT INTO with
PERFORM, it would also be called 10 times. Unfortunately, it is not
possible to use PERFORM directly in the CTE expression.

What CTE expression?


The documentation on the SELECT INTO suggests that the rows returned by
the test function may be discarded after the first one:

"If STRICT is not specified in the INTO clause, then target will be set
to the first row returned by the query, or to nulls if the query
returned no rows. (Note that "the first row" is not well-defined unless
you've used ORDER BY.) Any result rows after the first row are
discarded."
http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-ONEROW

However, it does not say anything about the number of rows the query
target list will be evaluated, meaning one may expect it to be evaluated
more than once. It seems that in the case of the example above
optimizing out calls to the 'test' function would only produce an
expected result if the function itself does not have any side-effects,
e.g.. qualifies as 'stable' or 'immutable'.

How about:

DO $$
DECLARE l_id integer;
    BEGIN
       FOR l_id IN SELECT id
        FROM generate_series(1,10) as id LOOP
            SELECT INTO l_id test(l_id);
       END LOOP;
    END;
$$ LANGUAGE plpgsql;


Is there some (undocumented) restriction on the functions allowed in the
SELECT target list, and isn't the optimization to limit the number of
calls to 'test' to 1 wrong in this case?

Kind regards,



--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux