"Jignesh K. Shah" <J.K.Shah@xxxxxxx> writes: > Scott Carey wrote: >> On 3/12/09 11:37 AM, "Jignesh K. Shah" <J.K.Shah@xxxxxxx> wrote: >> >> In general, I suggest that it is useful to run tests with a few different >> types of pacing. Zero delay pacing will not have realistic number of >> connections, but will expose bottlenecks that are universal, and less >> controversial > > I think I have done that before so I can do that again by running the users at > 0 think time which will represent a "Connection pool" which is highly utilized" > and test how big the connection pool can be before the throughput tanks.. This > can be useful for App Servers which sets up connections pools of their own > talking with PostgreSQL. A minute ago I said: Keep in mind when you do this that it's not interesting to test a number of connections much larger than the number of processors you have. Once the system reaches 100% cpu usage it would be a misconfigured connection pooler that kept more than that number of connections open. Let me give another reason to call this misconfigured: Postgres connections are heavyweight and it's wasteful to keep them around but idle. This has a lot in common with the issue with non-persistent connections where each connection is used for only a short amount of time. In Postgres each connection requires a process, which limits scalability on a lot of operating systems already. On many operating systems having thousands of processes in itself would create a lot of issues. Each connection then allocates memory locally for things like temporary table buffers, sorting, hash tables, etc. On most operating systems this memory is not freed back to the system when it hasn't been used recently. (Worse, it's more likely to be paged out and have to be paged in from disk even if it contains only garbage we intend to overwrite!). As a result, having thousands of processes --aside from any contention-- would lead to inefficient use of system resources. Consider for example that if your connections are using 1MB each then a thousand of them are using 1GB of RAM. When only 64MB are actually useful at any time. I bet that 64MB would fit entirely in your processor caches you weren't jumping around in the gigabyte of local memory your thousands of processes' have allocated. Consider also that you're limited to setting relatively small settings of work_mem for fear all your connections might happen to start a sort simultaneously. So (in a real system running arbitrary queries) instead of a single quicksort in RAM you'll often be doing unnecessary on-disk merge sorts using unnecessarily small merge heaps while gigabytes of RAM either go wasted to cover a rare occurrence or are being used to hold other sorts which have been started but context-switched away. To engineer a system intended to handle thousands of simultaneous connections you would want each backend to use the most light-weight primitives such as threads, and to hold the least possible state in local memory. That would look like quite a different system. The locking contention is the least of the issues we would want to deal with to get there. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance