I can't believe it.
I see some recommendations in Internet to do like this (e.g. https://stackoverflow.com/questions/5170546/how-do-i-delete-a-fixed-number-of-rows-with-sorting-in-postgresql).
Did it really work in 2011? Are you saying they broke it? It's a shame...
Anyway I think the problem is pretty clear: I want to eventually clear the table based on the predicate but I don't want to lock it for a long time.
The table does not have a primary key.
What should be a proper solution?
--
Vlad
пн, 17 дек. 2018 г. в 17:40, Tom Lane <tgl@xxxxxxxxxxxxx>:
Vladimir Ryabtsev <greatvovan@xxxxxxxxx> writes:
> I want to clean a large log table by chunks. I write such a query:
> delete from categorization.log
> where ctid in (
> select ctid from categorization.log
> where timestamp < now() - interval '2 month'
> limit 1000
> )
> Why does this query want to use Seq Scan and Sort on a 423M rows table?
There's no support for using ctid as a join key in this way; specifically,
nodeTidscan.c doesn't have support for being a parameterized inner scan,
nor does tidpath.c have code to generate such a plan. The header comments
for the latter say
* There is currently no special support for joins involving CTID; in
* particular nothing corresponding to best_inner_indexscan(). Since it's
* not very useful to store TIDs of one table in another table, there
* doesn't seem to be enough use-case to justify adding a lot of code
* for that.
Queries like yours are kinda sorta counterexamples to that, but pretty
much all the ones I've seen seem like crude hacks (and this one is not
an exception). Writing a bunch of code to support them feels like
solving the wrong problem. Admittedly, it's not clear to me what the
right problem to solve instead would be.
(It's possible that I'm overestimating the amount of new code that would
be needed to implement this, however. indxpath.c is pretty huge, but
that's mostly because there are so many cases to consider. There'd only
be one interesting case for an inner TID scan. Also, this comment is
ancient, predating the current approach with parameterized paths ---
in fact best_inner_indexscan doesn't exist as such anymore. So maybe
that old judgment that it'd take a lot of added code is wrong.)
regards, tom lane