Search Postgresql Archives

Re: delete statement returning too many results

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On 29. Nov 2022, at 18:35, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> 
> Harmen <harmen@xxxxxxxxx> writes:
>> On Mon, Nov 28, 2022 at 12:11:53PM -0500, Tom Lane wrote:
>>> So basically it's unsafe to run the sub-select more than once,
>>> but the query as written leaves it up to the planner whether
>>> to do that.  I'd suggest rephrasing as [...]
> 
>> I'm not the original poster, but I do use similar constructions for simple
>> postgres queues. I've been trying for a while, but I don't understand where the
>> extra rows come from, or what's "silent" about SKIP LOCKED.
> 
> Sorry, I should not have blamed SKIP LOCKED in particular; this
> construction will misbehave with or without that.  The issue is with
> using SELECT FOR UPDATE inside a DELETE or UPDATE that then modifies
> the row that the subquery returned.  The next execution of the subquery
> will, or should, return a different row: either some not-deleted row,
> or the modified row.  So in this context, the result of the subquery
> is volatile.  The point of putting it in a MATERIALIZED CTE is to
> lock the result down regardless of that.
> 
>> Because we get different results depending on the plan postgres picks, I can
>> see two options: either the query is broken, or postgres is broken.
> 
> You can argue that the planner should treat volatile subqueries
> differently than it does today.  But the only reasonable way of
> tightening the semantics would be to force re-execution of such a
> subquery every time, even when it's not visibly dependent on the
> outer query.  That would be pretty bad for performance, and I doubt
> it would make the OP happy in this example, because what it would
> mean is that his query "fails" every time not just sometimes.
> (Because of that, I don't have too much trouble concluding that
> the query is broken, whether or not you feel that postgres is
> also broken.)
> 
> The bigger picture here is that we long ago decided that the planner
> should not inquire too closely into the volatility of subqueries,
> primarily because there are use-cases where people intentionally rely
> on them not to be re-executed.  As an example, these queries give
> different results:
> 
> regression=# select random() from generate_series(1,3);
>       random        
> ---------------------
>  0.7637195395988317
> 0.09569374432524946
>   0.490132093120365
> (3 rows)
> 
> regression=# select (select random()) from generate_series(1,3);
>       random       
> --------------------
> 0.9730230633436501
> 0.9730230633436501
> 0.9730230633436501
> (3 rows)
> 
> In the second case, the sub-select is deemed to be independent
> of the outer query and executed only once.  You can argue that
> if that's what you want you should be forced to put the sub-select
> in a materialized CTE to make that plain.  But we felt that that
> would make many more people unhappy than happy, so we haven't
> done it.  Maybe the question could be revisited once all PG
> versions lacking the MATERIALIZED syntax are long dead. 
> 
> regards, tom lane

Thanks for the thorough explanation. That seems very reasonable.

The CTE query works well for my use case.

Thanks!







[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux