Search Postgresql Archives

Re: Select for update with offset interferes with concurrent transactions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/1/2011 6:32 AM, Yngve Nysaeter Pettersen wrote:
Hello all,

I am in the process of migrating a system from Postgresql 8.3 to 9.0,
and have run into a problem with the task queue systems I am using.

The task queue controls the allocation of tasks between about 1000
processes working in parallel, and is essentially a table of

record_id (unique)
project_id
task_description_id
state (idle, started, finished)

Each project currently have about 2 million entries. My plan is to
increase that significantly the next few months.

To avoid having the processes trample each other's queries (the first
attempt was to select the first matching entries of the table, which
caused one to block all other transactions), one of the steps I took was
to select a set of idle rows at a random offset into the table from the
project, mark them for update, then update each record's state as started.

SELECT record_id FROM queue WHERE project_id = my_project AND state =
idle LIMIT n OFFSET i FOR UPDATE

At present "n" is 100-150, "i" is a random value in the range 0-10000.

There is, intentionally, no ordering specified, since that would just
slow down the query, and is not necessary.

For reference, the above query is sent through Django's cursor.execute()
call in a manual transaction block.



What I've discovered when using Postgres 9.0 is that the processes are
now blocking every other query into this table, apparently reducing the
task processing speed by at least a factor of 10, and increasing the
load on the server by a similar factor, compared to when Postgres 8.3
was used. The problem is apparent just after starting, with only 50-100
processes active (startup is staggered).

Reducing "n" (and looping), or increasing the "i" range did not work.


The reason seems to be this new part of
http://www.postgresql.org/docs/9.0/static/sql-select.html (towards the
end of the FOR UPDATE section):

If a LIMIT is used, locking stops once enough rows have been returned to
satisfy the limit
(but note that rows skipped over by OFFSET will get locked). Similarly,
if FOR UPDATE or
FOR SHARE is used in a cursor's query, only rows actually fetched or
stepped past by the
cursor will be locked.

I can't find similar text in the 8.3 or 8.4 documentation.

AFAICT, and assuming I have not misunderstood this part of the
documentation this means that if one of my processing nodes selects a
block of 100 entries at offset 8000 in the resulting table, then every
other node will be blocked while the block is being processed, not just
the nodes that would have selected the rows in the range 0 to 7999, but
also >=8100, because they cannot gain access to the rows.

Also, using FOR SHARE does not seem to solve the problem.

IMO, as a database non-expert, locking rows that were not returned as a
result of the query is a bug. As an example, if a query selects the X
last items in the matching rows, that is equivalent to locking the
table, or the relevant part of it, even if the requester have no
intention to modify those other rows.


Is there any way to avoid this problem? Or do I have to add a random
batch_id field to the queue table in order to separate the processes'
queries so that they do not block each other (as frequently)?

Is it possible to disable the source code causing this (that is,
reverting the patch that introduced the problem, or changing a
configuration switch)?



So, if I understand correctly, you:

q = SELECT record_id FROM queue
	WHERE project_id = my_project AND state = idle
	LIMIT n OFFSET i FOR UPDATE
while not q.eof
	update queue set state = started where record_id = x;
	process record_id
	update queue set state = finsihed where record_id = x;
	q.next;


Might I suggest and alternative:

q = update queue set state = started
	WHERE project_id = my_project AND state = idle
	LIMIT n OFFSET i
	RETURNING project_id;
idlist = @q;
commit;

foreach x in idlist
	process record_id
	begin
	update queue set state = finsihed where record_id = x;
	commit;



Forgive the part perl part python sudocode. Oh, and I've never done this, no idea if it actually works. :-)

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux