Hi,
On Sat, 04 Jan 2014 16:23:42 +0100, Andrew Sullivan <ajs@xxxxxxxxxxxxxxx> wrote:
> On Sat, Jan 04, 2014 at 12:14:42PM +0100, Yngve N. Pettersen wrote:
>> The update query looks like this:
>> UPDATE queue SET state = E'S' WHERE state = E'I' AND id IN (<list
>> of integers>) RETURNING id;
>> There is a BEGIN/COMMIT wrap around the operation, including the
>> SELECT query.
> Am I right that you're doing the SELECT, pulling that up into the
> application, and then that becomes your <list of integers>? If so, my
> first advice is, "Don't do that," because you're making the
> transaction run longer (which is probably part of the problem).
> Instead, you probably want something like
> UPDATE . . . AND id IN (SELECT . . .LIMIT. . .) RETURNING id;
> This won't solve your problem perfectly, though.
I tried that before, but ran into some issues, IIRC a similar looping
problem to this one, where queries never ended. I split the operation up
in an attempt to solve that problem.
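For reference, the combined form Andrew suggests would look something like this (a sketch only; the table and column names are taken from the queries quoted in this thread, and the literal job id is hypothetical):

```sql
-- Sketch: select-and-update in one statement, so no id list has to
-- travel through the application between the two steps.
UPDATE queue
   SET state = E'S'
 WHERE state = E'I'
   AND id IN (SELECT id
                FROM queue
               WHERE state = E'I'
                 AND job_id = 42        -- hypothetical job id
               LIMIT 5000)
RETURNING id;
```

The subquery and the UPDATE run in the same transaction, which keeps the window in which other workers can grab the same rows as short as possible.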
>> My guess is that the active processes get into a lock/unlock loop
>> regarding the "state" field because the lists of ids overlap, and for
>> some reason they keep cycling instead of completing the operation
>> according to some kind of priority order.
> My guess is that you have a "lock inversion", yes, but it's hard to
> guess what it is. You want to look at the pg_locks view to figure out
> what's blocking what. It seems likely that your case is not strictly a
> deadlock. A deadlock is a case where transaction A has a lock on
> something that transaction B needs, while A in turn needs to wait for a
> lock on an object that is locked by transaction B. Neither one can
> possibly complete, and you get a deadlock detection.
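For reference, one way to see who is blocking whom is to join pg_locks against itself (a sketch; on 9.2+ you can additionally join pg_stat_activity on pid to see the queries involved):

```sql
-- Sketch: ungranted lock requests paired with the granted locks
-- they are waiting on.
SELECT w.pid AS waiting_pid,
       h.pid AS holding_pid,
       w.locktype,
       w.relation::regclass AS relation
  FROM pg_locks w
  JOIN pg_locks h
    ON h.granted
   AND w.locktype      = h.locktype
   AND w.database      IS NOT DISTINCT FROM h.database
   AND w.relation      IS NOT DISTINCT FROM h.relation
   AND w.transactionid IS NOT DISTINCT FROM h.transactionid
 WHERE NOT w.granted
   AND w.pid <> h.pid;
```

Sampling this view during one of the 100% CPU episodes would show whether the workers are in fact queueing on each other's row locks.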
That is why I called it a (possible) multiprocess lock/unlock loop, not a
deadlock.
It looks to me like one process starts looking at a task, then realizes
that it is being (or can be) handled by another process and passes the
baton, and that this happens so quickly that the processes keep the CPU
at 100%.
> There's an additional possibility that is suggested by your
> description, and that is that it's not locks at all, but that you're
> running into some kind of system issue. Did you adjust the setting of
> sort_mem? It's a long shot, but it could be that if there are enough
> sorts in the SELECT (which you don't show us), you're chewing through
In the select/update case there is no sorting in the query; there is an
offset/limit clause, though, and the number of records retrieved is
currently restricted to <10000 per query (out of 20 million in the
active subset).
SELECT id from queue where state = E'I' and job_id = <integer> offset
<random 200..150000> limit <1-6000>
The offset is to avoid having the processes tread on each other's toes
and select the same records, in case the order is predictable.
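An alternative to the random offset, for what it's worth, would be to let each worker skip rows another worker has already claimed, e.g. with transaction-scoped advisory locks (a sketch; pg_try_advisory_xact_lock exists from 9.1 on, the job id is hypothetical, and this assumes nothing else uses the row ids as advisory-lock keys):

```sql
-- Sketch: each worker claims only rows whose advisory lock it can take;
-- the locks are released automatically at COMMIT/ROLLBACK.
SELECT id
  FROM queue
 WHERE state = E'I'
   AND job_id = 42                      -- hypothetical job id
   AND pg_try_advisory_xact_lock(id)
 LIMIT 5000;
```

One caveat: the planner may evaluate the lock function for more rows than the LIMIT returns, so this is a sketch of the idea rather than a drop-in fix.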
There are indexes on several fields, including conditional indexes on
some state values in combination with job_id.
As I recall, there is no sorting in the SELECT case either, although there
is some filtering on some of the tables to restrict what is being returned.
> a lot of sort_mem. Remember that sort_mem is allocated _per sort_, so
> it could be that a lot of these allocations fit fine in real memory if
> only 8 processes are doing it; but if 10 do, you pass your threshold
> for physical memory and start swapping. I wouldn't expect high CPU
> under that case, though, but high I/O. So I think it's doubtful.
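(As an aside: in current releases sort_mem is called work_mem, and the per-sort point above means a single query with several sort or hash steps can use several multiples of it per backend. A sketch, with a hypothetical value:)

```sql
-- Sketch: work_mem (the successor of sort_mem) caps memory per
-- sort/hash operation, not per backend or per query, so a query with
-- e.g. 3 sorts may use up to 3 x this value in one backend.
SET work_mem = '64MB';   -- hypothetical value
```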
However, in the UPDATE case, the looping processes are all UPDATE queries,
no SELECTs involved.
There is no swap on the system (I don't want swapping, precisely because
it then slows down the system), but there is lots of RAM, with 180+GB
available.
I don't have the config file handy at the moment, but as I recall, while
I did adjust the sort_mem variable, it was not set high enough (IIRC
<=512MB) for 10 processes to consume that much memory even with a lot of
sorts, and I have not seen any peaks in application memory during the
loop events.
Another thing that IMO counts against this is that, for the UPDATE case,
the system seems quite able to handle 50-60 such queries at a time
(inside 10-30 seconds), except the first time the queue is accessed
after the job list has been initialized (and after ANALYZE has been run
on the table; the table is analyzed 6+ times an hour).
--
Using Opera's mail client: http://www.opera.com/mail/
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general