Re: High-Concurrency GiST in postgreSQL

"C. Mundi" <cmundi@xxxxxxxxx> · Mon, 5 Dec 2011 15:40:08 -0700

Agreed on the importance of understanding the transaction modes.  
I was specifically pointing to the potential latency of blocked reads during splitting nodes on inserting when rebalancing.  But as Paul points out, postgres does Ang/Tan splits.  While less optimal than R* splits, Ang/Tan is faster as I recall.  So it might not be so bad overall.

  And I appreciate the tip to look at pgPool which I didn't know about and will read up.
Thanks,

Carlos

On Dec 5, 2011 3:26 PM, "Andy Colson" <andy@xxxxxxxxxxxxxxx> wrote:

On 12/5/2011 3:41 PM, John R Pierce wrote:

On 12/05/11 1:34 PM, C. Mundi wrote:

So that's my concern. I'm doing 80% reads which are all non-blocking

with 20% writes mixed in, and I need to avoid the effect of writes

blocking queries which do not need to traverse branches affected by

the write.

postgres does no blocking on inserts/updates. the commonest lock is if

you're doing a transaction, and need to select something prior to

updating it, then you use a SELECT ... FOR UPDATE; this locks just the

rows you're going to update so noone else can update them (but other

clients can still read the existing value prior to your COMMIT).

As an addition to this, Reads and Writes wont block each other, but you'll need to watch the overlap if its a problem.  There are many ways to go about it depending on what you want (transaction isolation levels, locking, etc).

In general, I think it might look like:

connection1:

  start transaction

  select * from table where the_geom && POINT(a b)

connection2:

  start transaction

  update table set the_geom = POLYGON(a b c d) where rowid = 5;

connection1: (in the same transaction it started above)

  select the_geom from table where rowid = 5;

  -- gets the origional geom, NOT the one from connection2!

There are transaction options for read committed, read un-committed, etc, etc.  I don't rightly understand them all, but it sounds like you'll want to.

> traverse branches affected by the write

I assume that's a reference to building an underlying tree structure. You wont need to worry about it.  On the other hand, if that's a reference to some geo-boxing thing where one row is included in another and you need to update multiple rows, and I'm starting to confuse myself, then you might have a problem.

Also, as John points out, you'll want a connection pooler.  I've heard good things about pgPool.  It'll also spread read's across multiple computers just incase you need a faster response. (writes go to all computers, read's round-robin).

-Andy