On Thu, Jun 4, 2009 at 2:04 PM, Scott Carey <scott@xxxxxxxxxxxxxxxxx> wrote:
> To clarify if needed:
>
> I'm not saying the two issues are unrelated.  I'm saying that the
> relationship between connection pooling and a database is multi-dimensional,
> and the scalability improvement does not have a hard dependency on
> connection pooling.
>
> On one spectrum, you have the raw performance improvement by caching
> connections so they do not need to be created and destroyed frequently.
> This is a universal benefit to all databases, though some have higher
> overhead of connection creation than others.  Any book on databases
> mentioning connection pools will list this benefit.
>
> On another spectrum, a connection pool can act as a concurrency throttle.
> The benefit of such a thing varies greatly from database to database, but
> the trend for each DB out there has been to solve this issue internally and
> not trust client or third party tools to prevent concurrency/scalability
> related disasters.
>
> The latter should be treated separately, a solution to it does not have to
> address the connection creation/destruction efficiency -- almost all clients
> these days can do that part, and third party tools are simpler if they only
> have to meet that goal and not also try and reduce idle connection count.
>
> So a fix to the connection scalability issues only optionally involves what
> most would call connection pooling.
>
> -------
> Postgres' MVCC nature has something to do with it, but I'm sure there are
> ways to significantly improve the current situation.  Locks and processor
> cache-line behavior on larger SMP systems are often strangely behaving
> beasts.
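
To make the two dimensions in the quoted text concrete, here is a rough
Python sketch in which connection caching and concurrency throttling are
independent knobs on the same pool. Everything in it is hypothetical and
for illustration only: SketchPool, FakeConnection, run_demo, max_active and
max_idle are made-up names, the "connection" is a dummy object, and none of
this reflects how PostgreSQL or any particular pooler is implemented.

```python
import threading
import time
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for a real (expensive to create) connection."""


class SketchPool:
    """Illustrative pool: max_idle controls caching, max_active throttles."""

    def __init__(self, max_active, max_idle):
        # Throttle: at most max_active connections checked out at once.
        self._throttle = threading.BoundedSemaphore(max_active)
        # Cache: at most max_idle connections kept around for reuse.
        self._max_idle = max_idle
        self._idle = []
        self._lock = threading.Lock()
        self.created = 0  # how many connections had to be created

    @contextmanager
    def connection(self):
        with self._throttle:
            with self._lock:
                conn = self._idle.pop() if self._idle else None
                if conn is None:
                    conn = FakeConnection()
                    self.created += 1
            try:
                yield conn
            finally:
                with self._lock:
                    # Keep the connection only if the cache has room;
                    # otherwise it is simply dropped.
                    if len(self._idle) < self._max_idle:
                        self._idle.append(conn)


def run_demo(max_active, max_idle, workers):
    """Run `workers` threads; return (peak concurrency, connections created)."""
    pool = SketchPool(max_active, max_idle)
    state = {"active": 0, "peak": 0}
    state_lock = threading.Lock()

    def work():
        with pool.connection():
            with state_lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # pretend to run a query
            with state_lock:
                state["active"] -= 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"], pool.created


if __name__ == "__main__":
    print("throttle + cache:", run_demo(max_active=2, max_idle=2, workers=8))
    print("throttle only:   ", run_demo(max_active=2, max_idle=0, workers=8))
    print("cache only:      ", run_demo(max_active=8, max_idle=8, workers=8))
```

With max_idle=0 the pool still limits concurrency but creates a fresh
connection for every checkout, and with max_active set as high as the
number of workers it caches connections without throttling anyone, which
is the sense in which the two benefits can be treated separately.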

I think in the particular case of PostgreSQL the only suggestions I've
heard for improving performance with very large numbers of
simultaneous connections are (1) connection caching, not so much
because of the overhead of creating the connection as because it
involves creating a whole new process whose private caches start out
cold, (2) finding a way to reduce ProcArrayLock contention, and (3)
reducing the cost of deriving a snapshot.  I think (2) and (3) are
related but I'm not sure how closely.

As far as I know, Simon is the only one to submit a patch in this area
and I think I'm not being unfair if I say that that particular patch is
mostly nibbling around the edges of the problem.  There was a
discussion a few months ago on some possible changes to the lock modes
of ProcArrayLock, based I believe on some ideas from Tom (might have
been Heikki), but I don't think anyone has coded that or tested it.

We probably won't be able to make significant improvements in this
area unless someone comes up with some new, good ideas.  I agree with
you that there are probably ways to significantly improve the current
situation, but I'm not sure anyone has figured out with any degree of
specificity what they are.

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance