On 3/13/09 9:42 AM, "Jignesh K. Shah" <J.K.Shah@xxxxxxx> wrote:
Is this the server with 128 thread capability or 64 threads? Idle time is reduced but other locks are hit.
Now with a modified fix (not the original one that I proposed, but
something that works like a heart valve: it opens and shuts to a minimum
default, thus controlling how many waiters are woken up)
With 200ms sleeps, no lock change:
Peak throughput is 102000/min @ 1000 users; avg response time is 23ms. Linear ramp up until 900 users @ 98000/min and 12ms response time.
At 2000 users, response time is 229ms and throughput is 90000/min.
With 200ms sleeps, lock modification 1 (wake all):
Peak throughput is about 170000/min @ 2000 users, with avg response time 63ms. Plateau starts at 1600 users and 160000/min throughput. As before, the plateau starts when response time breaches 20ms, indicating contention.
Let's call the above a 65% throughput improvement at large connection counts.
-----------------
Now, with 0ms delay, no threading change:
Throughput is 136000/min @184 users, response time 13ms. Response time has not jumped too drastically yet, but linear performance increases stopped at about 130 users or so. ProcArrayLock busy, very busy. CPU: 35% user, 11% system, 54% idle
With 0ms delay, and lock modification 2 (wake some, but not all):
Throughput is 161000/min @ 328 users, response time 28ms. At 184 users, the same count as before the change, throughput is 147000/min with response time 0.12ms. Performance scales linearly to 144 users, then slows down and increases only slightly after that with more concurrency.
Throughput increase is between 15% and 25%.
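For anyone who wants to re-derive the percentages from the raw figures, here is a minimal sketch. The helper name `pct_gain` and the labels in `runs` are purely illustrative and are not part of the benchmark kit; the throughput values are the ones quoted above (the 200ms comparison uses the 160000/min plateau figure).

```python
def pct_gain(before: float, after: float) -> float:
    """Percentage throughput increase going from `before` to `after`."""
    return (after - before) / before * 100.0


# Throughput figures (transactions/min) taken from the results above.
# The labels are illustrative only.
runs = {
    "200ms sleeps: baseline peak -> wake-all plateau": (102_000, 160_000),
    "0ms delay: baseline peak -> wake-some peak": (136_000, 161_000),
    "0ms delay: both measured at 184 users": (136_000, 147_000),
}

for label, (before, after) in runs.items():
    print(f"{label}: {pct_gain(before, after):.1f}%")
```

This works out to roughly 57%, 18%, and 8% respectively. The peak-to-peak figure for the 0ms case sits inside the 15% to 25% range quoted above, while the like-for-like comparison at 184 users is smaller.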
What I see in the above is twofold:
1. This change improves throughput on this machine regardless of connection count.
2. The change seems to help more at higher connection counts with the wait; in fact, it seems to make connection count at this level not much of a factor at all.
The two changes tested are different, which clouds things a bit. I wonder what the first change would do in the second test case.
In any event, the second detail above is fascinating: it suggests that these locks are responsible for a significant chunk of the overhead of idle or mostly idle connections (making connection pools less useful, though they can never fix mid-transaction pauses, which are very common). And in any event, on large multiprocessor systems like this one, Postgres is lock limited whether or not a connection pool is used.