Hi all, especially Jim Jarvie: I only now saw your email to me on my
related issue. My issue remains this one:
"Well this got interesting - the already moved error showed up:"
and I have already gone through all that index pruning and all that
good stuff.
I remain with my original question from the 30th of June; to me it feels
like a bug of some sort:
"tuple to be locked was already moved to another partition due to
concurrent update"
This would not exactly look like a bug, because the message says "to
be locked", so at least it is not allowing two workers to lock the same
tuple. But it seems that the skip-locked mode should not turn this into
an error; it should treat it as if the tuple were already locked. Why
would it want to lock the tuple (representing the job) if another worker
has already finished its UPDATE of the job to mark it as "done" (which
is what makes the tuple move to the "completed" partition)?
Either the SELECT for jobs to do returned a wrong tuple, one that was
already updated, or there is some lapse in the locking. Either way, it
seems a waste of time to throw all these errors when the tuple should
not even have been selected for update and locking.
I wonder if anybody knows anything about this issue. Of course you'll
want to see the DDL and SQL queries, etc., but you can't really try it
out unless you do some massively parallel magic.
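If somebody does want to try that parallel magic, one way might be to run
the claiming query from many sessions at once, e.g. with pgbench; the
script file name and client counts below are just placeholders, and the
query itself is copied from the session quoted further down:

    -- claim_jobs.sql (hypothetical pgbench script)
    BEGIN;
    SELECT queueid, txobject, objectid, state
      FROM mq.queue
     WHERE (state = 'tx_active' OR state = 'tx_fail_retryable')
       AND txobject = 'ticket'
     LIMIT 250
       FOR UPDATE SKIP LOCKED;
    COMMIT;

run as "pgbench -n -c 50 -j 50 -T 60 -f claim_jobs.sql queuedb", while other
sessions keep updating claimed jobs to the finished state so that rows
actually move between partitions.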
I still think that it should simply not happen. Don't waste time on old
tuples trying to fetch and lock something that's no longer there. It's a
waste of resources.
regards,
-Gunther
On 8/20/2020 6:39 PM, Jim Jarvie wrote:
On 20-Aug.-2020 17:42, Michael Lewis wrote:
Can you share an explain analyze for the query that does the select for
update? I wouldn't assume that partition pruning is possible at all with
hash, and it would be interesting to see how it is finding those rows.
Well this got interesting - the already moved error showed up. Note:
the actual process partitions are regular table partitions; these are
not hashed. Only the incoming and completed partitions are hashed, due
to the row counts at either end of the processing; in-flight (where the
issue shows up) is quite small:
[queuedb] # explain analyze select queueid,txobject,objectid,state
from mq.queue where (state = 'tx_active' or state='tx_fail_retryable')
and txobject = 'ticket' limit 250 for update skip locked;
ERROR: 40001: tuple to be locked was already moved to another
partition due to concurrent update
LOCATION: heapam_tuple_lock, heapam_handler.c:405
Time: 579.131 ms
[queuedb] # explain analyze select queueid,txobject,objectid,state
from mq.queue where (state = 'tx_active' or state='tx_fail_retryable')
and txobject = 'ticket' limit 250 for update skip locked;
ERROR: 40001: tuple to be locked was already moved to another
partition due to concurrent update
LOCATION: heapam_tuple_lock, heapam_handler.c:405
Time: 568.008 ms
[queuedb] # explain analyze select queueid,txobject,objectid,state
from mq.queue where (state = 'tx_active' or state='tx_fail_retryable')
and txobject = 'ticket' limit 250 for update skip locked;
                                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..25.71 rows=250 width=34) (actual time=1306.041..1306.338 rows=250 loops=1)
   ->  LockRows  (cost=0.00..7934.38 rows=77150 width=34) (actual time=1306.040..1306.315 rows=250 loops=1)
         ->  Append  (cost=0.00..7162.88 rows=77150 width=34) (actual time=520.685..1148.347 rows=31500 loops=1)
               ->  Seq Scan on queue_tx_active  (cost=0.00..6764.50 rows=77000 width=34) (actual time=520.683..1145.258 rows=31500 loops=1)
                     Filter: ((txobject = 'ticket'::mq.queue_object) AND ((state = 'tx_active'::mq.tx_state) OR (state = 'tx_fail_retryable'::mq.tx_state)))
               ->  Seq Scan on queue_tx_fail_retryable  (cost=0.00..12.62 rows=150 width=34) (never executed)
                     Filter: ((txobject = 'ticket'::mq.queue_object) AND ((state = 'tx_active'::mq.tx_state) OR (state = 'tx_fail_retryable'::mq.tx_state)))
 Planning Time: 0.274 ms
 Execution Time: 1306.380 ms
(9 rows)
Time: 1317.150 ms (00:01.317)
[queuedb] #
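For illustration only, the layout described above (plain in-flight
partitions, hashed incoming/completed ends) might look roughly like the
following; every name, column type, enum value, and modulus here is a
guess, not the real DDL:

    -- hypothetical schema: list-partitioned by state, hash sub-partitions
    -- only on the high-volume ends
    CREATE SCHEMA mq;
    CREATE TYPE mq.queue_object AS ENUM ('ticket');          -- values assumed
    CREATE TYPE mq.tx_state AS ENUM
        ('incoming', 'tx_active', 'tx_fail_retryable', 'tx_done');

    CREATE TABLE mq.queue (
        queueid   bigint          NOT NULL,
        txobject  mq.queue_object NOT NULL,
        objectid  bigint,
        state     mq.tx_state     NOT NULL
    ) PARTITION BY LIST (state);

    -- small in-flight partitions, not hashed (the two seen in the plan above)
    CREATE TABLE mq.queue_tx_active PARTITION OF mq.queue
        FOR VALUES IN ('tx_active');
    CREATE TABLE mq.queue_tx_fail_retryable PARTITION OF mq.queue
        FOR VALUES IN ('tx_fail_retryable');

    -- high-volume end, hash-partitioned; shown for the completed side only,
    -- the incoming side would look the same
    CREATE TABLE mq.queue_tx_done PARTITION OF mq.queue
        FOR VALUES IN ('tx_done') PARTITION BY HASH (queueid);
    CREATE TABLE mq.queue_tx_done_0 PARTITION OF mq.queue_tx_done
        FOR VALUES WITH (MODULUS 2, REMAINDER 0);
    CREATE TABLE mq.queue_tx_done_1 PARTITION OF mq.queue_tx_done
        FOR VALUES WITH (MODULUS 2, REMAINDER 1);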