On Thu, 4 Jun 2009 06:57:57 -0400, Robert Haas <robertmhaas@xxxxxxxxx> wrote in http://archives.postgresql.org/pgsql-performance/2009-06/msg00065.php : > I think I see the distinction you're drawing here. IIUC, you're > arguing that other database products use connection pooling to handle > rapid connect/disconnect cycles and to throttle the number of > simultaneous queries, but not to cope with the possibility of large > numbers of idle sessions. My limited understanding of why PostgreSQL > has a problem in this area is that it has to do with the size of the > process array which must be scanned to derive an MVCC snapshot. I'd > be curious to know if anyone thinks that's correct, or not. > > Assuming for the moment that it's correct, databases that don't use > MVCC won't have this problem, but they give up a significant amount of > scalability in other areas due to increased blocking (in particular, > writers will block readers). So how do other databases that *do* use > MVCC mitigate this problem? I apologize if it is bad form to respond to a message that is two months old, but I did not see this question answered elsewhere and thought it would be helpful to have it answered. This my rough understanding. Oracle never "takes" a snapshot, it computes one the fly, if and when it is needed. It maintains a structure of recently committed transactions, with the XID for when they committed. If a process runs into a tuple that is neither from the future nor from the deep past, it consults this structure to see if that transaction has committed, and if so whether it did so before or after the current query was started. The structure is partionable so it does not have one global lock to serialize on, and the lock is short as it only gets the info it needs, not the entire set of global open transactions. > The only one that we've discussed here is > Oracle, which seems to get around the problem by having a built-in > connection pooler. There are several reasons to have something like Oracle's shared server (or whatever they call it now), and I don't think global serialization on snapshots is high among them, at least not for Oracle. With shared server, you can (theoretically) control memory usage so that 10,000 independent processes don't all decide to do a large in-memory sort or hash join at the same time. It is also a bit more than a standard connection pooler, because multiple connections can be in the middle of non-read-only transactions on the same backend at the same time. I don't think client-based pools allow that. Jeff -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance