On Mon, Jan 9, 2012 at 2:58 PM, Chris Angelico <rosuav@xxxxxxxxx> wrote: > In this project, we need to have a disk-based priority queue, with a > few snazzy features (eg limited retries and scheduling) but nothing > particularly exotic. Requests are added to the queue by any number of > different processes, and it's not a problem if a request "misses the > check" and is delayed by a few seconds. Requests are processed by a > pool of workers which are unaware of each other (and may be running on > different computers); the sole arbiter is the database itself. Since posting this, I've researched advisory locks and ended up going with a quite different model. In case anyone else wants to build a system of this nature, here's what we're currently using. The system is optimized, at any given time, for a particular number of worker processes. For discussion purposes, let's say 8. (This number is stored in the database.) Each worker queries the pg_locks table to find an available slot, then claims it using pg_try_advisory_lock() - the query is to avoid spamming the lock function unnecessarily (a sequential scan would also work). The slots are numbered 0-7, and represent partitions of the available work, using the sequential ID modulo 8. This is the biggest difference. Instead of marking individual calls as 'claimed', then processing them, then de-claiming them at the end, the semaphoring is done on the entire partition and using advisory locks - cutting MVCC spam 50% in the process. This technique does have a vulnerability, however; if a worker process terminates (or the whole computer it's running on dies), one eighth of the work will simply never happen. This is dealt with out-of-band by having another process "keep an eye on" the pg_locks table, checking that there are always 8 advisory locks in the appropriate section. Performance is excellent, due in part to an index involving (id%8) - this index is rebuilt by script any time the partition count is changed, which is a significant administrative change involving a shutdown of all workers. To anyone who has need of unusual locking techniques, may I commend advisory locks to your notice. They're high performance and come with automatic unlocking on disconnect (which was an issue with the original "update set claimed=N" technique - some process would have to detect that a worker had gone, and reset the claimed marker for that row). An excellent feature. Chris Angelico -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general