Re: Bug? Function with side effects not evaluated in CTE

David Johnston <polobo@xxxxxxxxx> · Mon, 21 Oct 2013 12:49:16 -0700 (PDT)

Tom Lane-2 wrote
> Moshe Jacobson &lt;

> moshe@

> &gt; writes:
>> I am of the belief that if the function in a CTE is volatile, that it
>> should be executed unconditionally.
> 
> [ shrug... ]  Consider
> 
> 	SELECT volatile_function(i) FROM generate_series(1, 10) i LIMIT 1;
> 
> How many times should the volatile function get executed?  If your answer
> is not "10", how is this different from the CTE case?  This LIMIT clause
> is restricting the number of times the function executes in pretty much
> the same way that our definition of CTE evaluation does, AFAICS.
> 
> You could of course argue that our definition of LIMIT is wrong too,
> but that's going to raise the bar for convincing people even higher,
> because of the number of existing applications that such a redefinition
> would break.

The CTE would functionally replace the generate_series() call as opposed to
the select-list evaluation.  Since the CTE establishes an optimization
boundary the parts of the query below the main (pulling) FROM clause should
not (or need not) influence the evaluation of the CTE.

The two comparable queries are:

A) WITH vf ( SELECT volatile_function(x) FROM generate_series(1,10) )
SELECT * FROM vf LIMIT 1

B) SELECT volatile_function(x) FROM generate_series(1,10) gs (x) LIMIT 1

In (A) the relation "vf" - which is a 10-row table with the result of
volatile_function as the only column - is limited to a single record and
that whole row is output as-is (because of the "*")

In (B) the relation "gs" - which is 10 rows having the result of
generate_series as the only column - is limited to a single row and then the
select-list project occurs against that single row (the volatile_function)

This is my naive, not technically informed, opinion of how these two
constructs differ.

This makes the optimization boundary characteristic of CTEs much stronger so
making such a boundary dependent upon whether the CTE contains any
volatile_functions seems desirable.  That way if "volatile_function" is
instead made stable then its evaluation 10-times can be avoided; though in
this case that would depend on whether pushing down the LIMIT is even valid.

As commented by Moshe the "number of times" is less and issue than "yes/no"
determination but I guess that any simple implementation would have to
handle both cases identically so its impossible to ignore the implication on
the "number of times" queries in solving the "yes/no" problem.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Bug-Function-with-side-effects-not-evaluated-in-CTE-tp5774792p5775321.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general