"Jignesh K. Shah" <J.K.Shah@xxxxxxx> writes: > Scott Carey wrote: >> On 3/12/09 11:37 AM, "Jignesh K. Shah" <J.K.Shah@xxxxxxx> wrote: >> >> In general, I suggest that it is useful to run tests with a few different >> types of pacing. Zero delay pacing will not have realistic number of >> connections, but will expose bottlenecks that are universal, and less >> controversial > > I think I have done that before so I can do that again by running the users at > 0 think time which will represent a "Connection pool" which is highly utilized" > and test how big the connection pool can be before the throughput tanks.. This > can be useful for App Servers which sets up connections pools of their own > talking with PostgreSQL. A minute ago I said: Keep in mind when you do this that it's not interesting to test a number of connections much larger than the number of processors you have. Once the system reaches 100% cpu usage it would be a misconfigured connection pooler that kept more than that number of connections open. Let me give another reason to call this misconfigured: Postgres connections are heavyweight and it's wasteful to keep them around but idle. This has a lot in common with the issue with non-persistent connections where each connection is used for only a short amount of time. In Postgres each connection requires a process, which limits scalability on a lot of operating systems already. On many operating systems having thousands of processes in itself would create a lot of issues. Each connection then allocates memory locally for things like temporary table buffers, sorting, hash tables, etc. On most operating systems this memory is not freed back to the system when it hasn't been used recently. (Worse, it's more likely to be paged out and have to be paged in from disk even if it contains only garbage we intend to overwrite!). As a result, having thousands of processes --aside from any contention-- would lead to inefficient use of system resources. Consider for example that if your connections are using 1MB each then a thousand of them are using 1GB of RAM. When only 64MB are actually useful at any time. I bet that 64MB would fit entirely in your processor caches you weren't jumping around in the gigabyte of local memory your thousands of processes' have allocated. Consider also that you're limited to setting relatively small settings of work_mem for fear all your connections might happen to start a sort simultaneously. So (in a real system running arbitrary queries) instead of a single quicksort in RAM you'll often be doing unnecessary on-disk merge sorts using unnecessarily small merge heaps while gigabytes of RAM either go wasted to cover a rare occurrence or are being used to hold other sorts which have been started but context-switched away. To engineer a system intended to handle thousands of simultaneous connections you would want each backend to use the most light-weight primitives such as threads, and to hold the least possible state in local memory. That would look like quite a different system. The locking contention is the least of the issues we would want to deal with to get there. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance