Is there a known bug with SKIP LOCKED and "tuple to be locked was already moved to another partition due to concurrent update"?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

long time ago I devised with your help a task queuing system which uses SELECT ... FOR UPDATE SKIP LOCKED for many parallel workers to find tasks in the queue, and it used a partitioned table where the hot part of the queue is short and so the query for a job is quick and the skip locked locking makes sure that one job is only assigned to one worker. And this works pretty well for me, except that when we run many workers we find a lot of these failures occurring:

"tuple to be locked was already moved to another partition due to concurrent update"

This would not exactly look like a bug, because the message says "to be locked", so at least it's not allowing two workers to lock the same tuple. But it seems that the skip-locked mode should not make an error out of this, but treat it as the tuple was already locked. Why would it want to lock the tuple (representing the job) if another worker has already finished his UPDATE of the job to mark it as "done" (which is what makes the tuple move to the "completed" partition.)

Either the SELECT for jobs to do returned a wrong tuple, which was already update, or there is some lapse in the locking.

Either way it would seem to be a waste of time throwing all these errors when the tuple should not even have been selected for update and locking.

I wonder if anybody knows anything about that issue? Of course you'll want to see the DDL and SQL queries, etc. but you can't really try it out unless you do some massively parallel magic. So I figured I just ask.

regards,
-Gunther






[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux