> On 29. Nov 2022, at 18:35, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > > Harmen <harmen@xxxxxxxxx> writes: >> On Mon, Nov 28, 2022 at 12:11:53PM -0500, Tom Lane wrote: >>> So basically it's unsafe to run the sub-select more than once, >>> but the query as written leaves it up to the planner whether >>> to do that. I'd suggest rephrasing as [...] > >> I'm not the original poster, but I do use similar constructions for simple >> postgres queues. I've been trying for a while, but I don't understand where the >> extra rows come from, or what's "silent" about SKIP LOCKED. > > Sorry, I should not have blamed SKIP LOCKED in particular; this > construction will misbehave with or without that. The issue is with > using SELECT FOR UPDATE inside a DELETE or UPDATE that then modifies > the row that the subquery returned. The next execution of the subquery > will, or should, return a different row: either some not-deleted row, > or the modified row. So in this context, the result of the subquery > is volatile. The point of putting it in a MATERIALIZED CTE is to > lock the result down regardless of that. > >> Because we get different results depending on the plan postgres picks, I can >> see two options: either the query is broken, or postgres is broken. > > You can argue that the planner should treat volatile subqueries > differently than it does today. But the only reasonable way of > tightening the semantics would be to force re-execution of such a > subquery every time, even when it's not visibly dependent on the > outer query. That would be pretty bad for performance, and I doubt > it would make the OP happy in this example, because what it would > mean is that his query "fails" every time not just sometimes. > (Because of that, I don't have too much trouble concluding that > the query is broken, whether or not you feel that postgres is > also broken.) > > The bigger picture here is that we long ago decided that the planner > should not inquire too closely into the volatility of subqueries, > primarily because there are use-cases where people intentionally rely > on them not to be re-executed. As an example, these queries give > different results: > > regression=# select random() from generate_series(1,3); > random > --------------------- > 0.7637195395988317 > 0.09569374432524946 > 0.490132093120365 > (3 rows) > > regression=# select (select random()) from generate_series(1,3); > random > -------------------- > 0.9730230633436501 > 0.9730230633436501 > 0.9730230633436501 > (3 rows) > > In the second case, the sub-select is deemed to be independent > of the outer query and executed only once. You can argue that > if that's what you want you should be forced to put the sub-select > in a materialized CTE to make that plain. But we felt that that > would make many more people unhappy than happy, so we haven't > done it. Maybe the question could be revisited once all PG > versions lacking the MATERIALIZED syntax are long dead. > > regards, tom lane Thanks for the thorough explanation. That seems very reasonable. The CTE query works well for my use case. Thanks!