Search Postgresql Archives

Re: is there a select for update insert if not exist type command?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 9 Jun 2012 15:41:34 -0700 Evan Rempel <erempel@xxxxxxx> wrote:

> I have a project where I will have two clients essentially doing the
> same things at the same time. The idea is that if one has already done the
> work, then the second one does not need to do it.
> 
> I was hoping that adding a task related unique identifier to a table
> could be used to coordinate these client, something like a primary key and using
> select for update.
> 
> The challenge I have is during the initial insert. One of the two clients will cause postgresql
> to log an error, which I would rather avoid (just seems dirty).
> 
> Here is the time line;
> 
> Both clients A and B becomes aware to do a task
> 
> Client A or client B issues the "select for update ... if not exist do insert" type command
> The other client gets blocked on the "select for update.
> 
> First client finishes insert/updates to record that it has delt with the task
> 
> second client gets unblocked and reads the record realizing that the first client delt with the task already.
> 
> 
> It is the "select for update ... if not exist do insert" type command that I am ignorant of how to code.
> 
> Anyone care to school me?

It's amazing to me how often I have this conversation ...

How would you expect SELECT FOR UPDATE to work when you're checking to see
if you can insert a row?  If the row doesn't exist, there's nothing to
lock against, and thus it doesn't help anything.  FOR UPDATE is only
useful if you're UPDATING a row.

That being given, there are a number of ways to solve your problem.  Which
one you use depends on a number of factors.

If it's x number of processes all contending for one piece of work, you could
just exclusive lock the entire table, and do the check/insert with the
table locked.  This essentially creates a wait queue.

If the processes need to coordinate around doing several pieces of work, you
can put a row in for each piece of work with a boolean field indicating
whether a process is currently working on it.  Then you can SELECT FOR
UPDATE a particular row representing work to be done, and if the boolean
isn't already true, set it to true and start working.  In my experience,
you'll benefit from going a few steps forward and storing some information
about what's being done on it (like the PID of the process working on it,
and the time it started processing) -- it just makes problems easier to
debug later.

There are other approaches as well, but those are the two that come to
mind.

Not sure what your experience level is, but I'll point out that these
kinds of things only work well if you're transaction management is
correct.  I have seen people struggle to get these kind of things working
because they didn't really understand how transactions and locking interact,
or they were using some sort of abstraction layer that does transaction
stuff in such an opaque way that they couldn't figure out what was actually
happening.

Hope this helps.

-- 
Bill Moran <wmoran@xxxxxxxxxxxxxxxxx>

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux