Tom Lane wrote:
Anyway I think the main practical problem would be with deadlocks against other transactions trying to update/delete tuples at the same time you need to move them. Dealing with uncommitted insertions would be tricky too --- I think you'd need to wait out the inserting transaction, which would add more possibilities of deadlock.
I really appreciate your taking the time to think about and explain this. It's very helpful, as I'm trying to understand some of the basics of PostgreSQL's underlying operation.
I'd completely missed thinking about uncommitted inserts - I never normally need to think about them so they just didn't cross my mind. I guess it'd either have to do the equivalent of a SELECT FOR UPDATE NOWAIT on all tuples in the pages to be freed before doing anything else, or would have to take out an EXCLUSIVE table lock while freeing a chunk of pages.
I can also vaguely see how problems would arise with concurrent multi-tuple updates grabbing locks in a different order to the progressive cluster and deadlocking, and again hadn't even thought about that.
I guess it might be OK if the progressive cluster attempted to get row exclusive locks on all tuples in the contiguous range of pages to be freed, and if it failed to get even one it released them all and retried that whole step. It sounds like it could be slow and inefficient, though, possibly so much so as to defeat the point of the clustering operation in the first place.
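To make the "grab everything or back off" idea concrete, here is a rough sketch of the retry loop I have in mind. It is only an in-memory illustration using Python's threading.Lock as a stand-in for per-tuple locks; it says nothing about how PostgreSQL's lock manager actually works, and the names try_lock_all, free_page_range and move_tuples are made up for the example.

```python
import random
import threading
import time


def try_lock_all(locks):
    """Try to take every lock without blocking.

    On the first failure, release whatever was taken and return False,
    so the caller never waits while holding a partial set of locks.
    """
    held = []
    for lock in locks:
        if lock.acquire(blocking=False):
            held.append(lock)
        else:
            for taken in reversed(held):
                taken.release()
            return False
    return True


def free_page_range(locks, move_tuples, max_attempts=50, base_delay=0.001):
    """Retry the all-or-nothing acquisition, then run move_tuples().

    Returns the number of attempts used, or None if it gave up.
    max_attempts and base_delay are arbitrary illustrative values.
    """
    for attempt in range(1, max_attempts + 1):
        if try_lock_all(locks):
            try:
                move_tuples()  # hypothetical: relocate the tuples
            finally:
                for lock in reversed(locks):
                    lock.release()
            return attempt
        # Randomised, growing pause before retrying the whole step.
        time.sleep(random.uniform(0, base_delay * attempt))
    return None
```

Because it never blocks while holding some of the locks, this loop can't take part in a deadlock itself, but on a busy table it could spin for a long time or give up entirely, which is exactly the inefficiency I'm worried about.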
Thanks again for taking the time to go over that - it's extremely helpful and much appreciated.
-- Craig Ringer