On Wed, Nov 4, 2009 at 9:41 AM, Brian Karlak <zenkat@xxxxxxxxxxx> wrote:
>
> I do a select for update in a stored proc:
>
> FOR queue_item IN
>
>     SELECT * FROM queue
>     WHERE status IS NULL AND id >= low_bound_id
>     ORDER BY id LIMIT batch_size
>     FOR UPDATE
>
> LOOP
>
>     UPDATE queue_proc set status = 'proc' where id = queue_item.id ;
>
> The daemons keep track of their last position in the queue with
> low_bound_id.  Also, as you probably notice, I also fetch a batch of (100)
> items at a time.  In practice, it's pretty fast.  The job I'm running now
> is showing an average fetch time of 30ms per 100 actions, which ain't bad.
>
> However, the writing of results back to the row takes ~5ms, which is
> slower than I'd like.

5 ms for each of the 100 actions?  With one commit per action?

>> It seems you have a select, an update, and another update.  Where in
>> this process do you commit?  Are you using fsync=off or
>> synchronous_commit=off?
>
> First commit occurs after the stored proc to select/update a batch of
> items is complete.

So one commit per 100 items?

> Second commit occurs on the writing of results back for each
> particular action.

So one commit per item?  If so, this completely explains the difference
in speed, I think.

> Two commits are required because the time it takes to complete the
> intervening action can vary wildly: anywhere between 20ms and 45min.

Is there any way of knowing/approximating ahead of time how long it will
take?  The 45 min monsters must be exceedingly rare, or else the average
could not be ~50ms.

>> Why would the index scan take 1 ms two of the times it is done but 5ms
>> the third time?  Isn't it the same index scan each time?  Or does the
>> change in queue.status change the plan?
>
> The final update is a different query -- just a plain old update by ID:
>
>     UPDATE queue_proc set status = 'proc' where id = %s ;

That looks very much like the other UPDATE you showed.  The difference, it
seems, is that you commit after every one, rather than after every 100.
Right?

> This update by ID takes ~2.5ms, which means it's where the framework is
> spending most of its overhead.

You said the computation task can take anywhere from 20ms to 45min, so it
seems that this update overhead is at most 1/8 of the irreducible time
(2.5 ms against the 20 ms minimum for the action itself).  That doesn't
seem like it is enough to worry about, to me.

Jeff
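
For illustration, a minimal sketch of two ways the per-item commit cost
discussed above could be reduced.  It reuses the table and column names
from the quoted queries (queue_proc, status, id); the literal ids are made
up, and whether either option fits depends on how durable each individual
result write-back needs to be.

    -- Option 1: keep one commit per action, but make that commit
    -- asynchronous.  synchronous_commit can be turned off for just this
    -- transaction; COMMIT then returns without waiting for the WAL flush.
    BEGIN;
    SET LOCAL synchronous_commit TO off;
    UPDATE queue_proc SET status = 'proc' WHERE id = 12345;  -- id illustrative
    COMMIT;

    -- Option 2: amortize the commit the same way the claim step does, by
    -- writing several finished actions back in a single transaction.
    BEGIN;
    UPDATE queue_proc SET status = 'proc' WHERE id IN (12345, 12346, 12347);
    COMMIT;

Unlike fsync=off, synchronous_commit=off cannot corrupt the database; a
crash can only lose the last few commits, so the worst case is redoing a
handful of recently finished actions.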