On Mon, Nov 23, 2015 at 2:41 AM, Chris Withers <chris@xxxxxxxxxxxxxxxx> wrote:

> Hi All,
>
> I wondered if any of you could recommend best practices for using a
> postgres table as a queue. Roughly speaking, 100-200 workers will vomit
> rows at rates of a few hundred per second into the table, leaving the
> status as new, and then as many workers as needed to keep up with the
> load will plough through the queue changing the status to something
> other than new.

Is that several hundred per second per worker, or just several hundred per
second in total?

What happens if a worker finds the database has crashed when it goes to
insert the records?  That would determine what kind of transactional system
you need.

On the consumer side, what would happen if a record were processed twice?

> My naive implementation would be something along the lines of:
>
> CREATE TABLE event (
>     ts      timestamp,
>     event   char(40),
>     status  char(10),
>     CONSTRAINT pkey PRIMARY KEY(ts, event)
> );

How long are you going to keep these records around once they have been
processed?  Unless you delete them immediately, you will probably want a
partial index on (ts) where status='new' (sketch at the end of this mail).

> ...with writers doing INSERT or COPY to get data into the table and
> readers doing something like:
>
> SELECT * FROM event WHERE status='new' LIMIT 1000 FOR UPDATE;
>
> ...so, grabbing batches of 1,000, working on them and then setting their
> status.

Why 1000 at a time?  Also, you probably need an ORDER BY.

> But, am I correct in thinking that SELECT FOR UPDATE will not prevent
> multiple workers selecting the same rows?

You are correct, but the way it prevents multiple workers from selecting
them at the same time is that the next worker will block until the first
one commits.  To avoid that, you would either need to use SKIP LOCKED in
the 9.5 release, or pg_try_advisory_xact_lock on earlier versions (rough
sketches of both at the end of this mail).

And how do you unlock the rows?  There are two general approaches.  One is
to lock the row using PostgreSQL's FOR UPDATE type locks and hold the
transaction open while processing, then update the row to mark it done and
commit to release the lock.  This cleans up after itself if a worker
crashes, but there is no visibility into what is going on.  The other is to
claim the row for the worker by updating a status field (for example, to
hold a hostname and pid) and committing that, then doing the processing,
then updating the row again to mark it done and committing that.  An
abnormally terminated worker will need to have someone or something clean
up after it, but this gives you much better visibility into what is
happening.

> Anyway, is this approach reasonable? If so, what tweaks/optimisations
> should I be looking to make?

The best optimization would be to not do it at all.  Why can't the 200
inserting workers just do the work themselves immediately (either client
side or server side), instead of queuing it?  If there is any possibility
of making the processing fast enough to do it that way, I'd surely spend my
time optimizing the actual work rather than optimizing a queuing system.
And depending on what failure modes you can tolerate, consider a
best-effort dedicated queuing system rather than a perfectly ACID one built
on PostgreSQL.

Cheers,

Jeff
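
P.S.  A few rough sketches, in case they are useful.  First, the partial
index mentioned above, against the table exactly as you proposed it (the
index name is just a placeholder):

    CREATE INDEX event_new_idx ON event (ts) WHERE status = 'new';

That lets the consumers find the 'new' rows without having to scan the
ever-growing pile of already-processed ones.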
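
Second, the "claim it, commit, process it, mark it done" style on 9.5, with
SKIP LOCKED doing the claiming and the status update in one statement.
This is only a sketch: 'worker-7' stands in for whatever hostname/pid
identifier you settle on (a real hostname+pid won't fit in char(10), so
you'd widen that column), and the batch size of 100 is arbitrary.

    -- claim a batch of new rows for this worker, skipping rows that
    -- other workers have already locked
    WITH batch AS (
        SELECT ts, event
          FROM event
         WHERE status = 'new'
         ORDER BY ts
         LIMIT 100
           FOR UPDATE SKIP LOCKED
    )
    UPDATE event e
       SET status = 'worker-7'
      FROM batch b
     WHERE e.ts = b.ts AND e.event = b.event
    RETURNING e.ts, e.event;

    -- ... do the actual processing ...

    -- then mark the claimed rows as done
    UPDATE event SET status = 'done' WHERE status = 'worker-7';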
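
And on versions before 9.5, the pg_try_advisory_xact_lock route looks
roughly like this, paired with the hold-the-transaction-open style of
processing.  Again, just the general shape: the advisory-lock key here is a
made-up hash of the primary key, and mixing LIMIT with a volatile locking
function in the WHERE clause has well-known planner gotchas, so test it
under concurrency before trusting it.

    -- grab up to 100 new rows whose advisory lock no other open
    -- transaction currently holds; the lock is released at commit
    SELECT ts, event
      FROM event
     WHERE status = 'new'
       AND pg_try_advisory_xact_lock(hashtext(ts::text || event))
     ORDER BY ts
     LIMIT 100;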