Search Postgresql Archives

Re: PostgreSQL question re select for update (locking??)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Reid Thompson wrote:

Each server process claims a jobq record by selecting for update a
jobq record where the pid column is null, then rewrites the record with
the pid set in the pid column.

The "distilled" sql select statement is:

    * SELECT J.*, C.name, C.client_id, C.priority
    * FROM jobq J, campaign C
    * WHERE J.pid IS NULL
    * AND 'my_YYYYMMDDhhmmss'>=J.due_date
    * AND J.campaign_id=C.id
    * ORDER BY C.priority,J.due_date,J.item_id
    * LIMIT 1 FOR UPDATE;


What we are seeing is:

- We dump hundreds of records into the jobq table, expecting that a
  server process will each claim a record (and process it), but some of
  the processes call the claim function but get nothing returned by the
  select; they sleep for a short time (eg 10 seconds - this sleep
  differs on a per process basis) then attempt to claim again.
  Sometimes this claim returns a jobq record, other times it may take 4
  or 5 attempted claims before a record is actually returned by the
  select.

  Our expectation was that with hundreds of selectable records available,
that each server process would 'get' the next available record -- Is this a false expectation on our part?
  Is the currently 'active' select for update limit 1 blocking the others?
  If not, can someone give us some direction..??

The problem is that the order of events isn't quite what you expect.

You can think of the SELECT ... LIMIT 1 as identifying a set containing one row. Since it's FOR UPDATE, the backend tries to get a lock, and if necessary waits until it can. Of course, all the other processors are doing the same thing.

When another process updates that row, the backend re-checks the conditions and you end up with a set of zero rows, which it helpfully gains a lock on and then returns to you!

So - what to do? Well, there are several options.

1. Keep the lock/update period short by just marking the row as "claimed" and committing. You still need to loop trying to get a job to process, but the process is smoother because the possible window for conflicts is short.

2. Select "random" jobs from the list if you don't care about order. Again, you're just making things smoother, not removing the requirement to loop checking.

3. Have a "master" process that allocates jobs to processors. Or simulate such a thing by having processors call an update_jobs() function. That way you can pre-allocate jobs to processors (at perhaps a cost in efficiency if the number of jobs in-queue falls below the number of processors).

--
  Richard Huxton
  Archonet Ltd


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux