On 03/12/09 13:48, Scott Carey wrote:
> On 3/11/09 7:47 PM, "Tom Lane" <tgl@xxxxxxxxxxxxx> wrote:
>
> All I'm adding is that it makes some sense to me, based on my experience
> in CPU / RAM bound scalability tuning. It was expressed that the test
> itself didn't even make sense.

SSDs are precisely my motivation for doing RAM-based tests with PostgreSQL. While I am waiting for my SSDs to arrive, I have started emulating them by putting the whole database on a RAM disk, which in a sense is better than an SSD; if we can tune with RAM disks, then SSDs are covered as well.

What we have is a pool of 2000 users. Each user does a series of transactions on different rows, and we watch how far the database scales linearly before some bottleneck (system or database) kicks in and there is no further linear increase with active users. Many times there is a drop after reaching some number of active users. If all 2000 users scale linearly, another test with, say, 2500 users can be run. The whole point is to find the limit we can reach while there are typically still system resources left unexploited.

The test kit I am using is a lightweight OLTP-ish workload which each user runs against a known schema, with an emulated think time of 200 ms between the transactions it executes. In that sense it emulates a real user who clicks, waits to see the result, and then clicks again, causing another transaction (not exactly, but you get the point). Like all such workloads, it is generally used to find bottlenecks in a system before putting production load on it.

In my current environment I am running this kind of workload and seeing how many users I can add before the system has no CPU left to sustain linear growth in tpm. Generally, as many of you mentioned, you run into disk latency, network latency, CPU shortage, and so on, and that is exactly the work I am doing right now. I work around network latency by using a private network and tuning operating-system settings for efficiency there; I work around disk latency by putting the data on a RAM disk (and soon on SSDs). If, after all that, I still cannot consume all the CPU, it probably means I am hitting locks, and with the PostgreSQL DTrace probes I can see what is happening.
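To make the shape of one emulated user concrete, here is a minimal sketch in C using libpq (purely illustrative, not the actual test kit; the connection string, the "accounts" table, and the single-row UPDATE are made-up placeholders) of the transaction / 200 ms think-time loop described above:

/* toy_user.c -- illustrative sketch of one emulated user: run a small
 * transaction, think for 200 ms, repeat.  Not the real test kit; the
 * DSN, table and query below are hypothetical.
 * Build:  cc toy_user.c -o toy_user -lpq
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=testdb");    /* hypothetical DSN */

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    for (int i = 0; i < 10000; i++)
    {
        /* one small OLTP-style transaction touching a single row */
        char      sql[128];
        PGresult *res;

        snprintf(sql, sizeof(sql),
                 "UPDATE accounts SET hits = hits + 1 WHERE id = %d",
                 rand() % 2000 + 1);

        res = PQexec(conn, "BEGIN");   PQclear(res);
        res = PQexec(conn, sql);       PQclear(res);
        res = PQexec(conn, "COMMIT");  PQclear(res);

        usleep(200 * 1000);     /* 200 ms think time between transactions */
    }

    PQfinish(conn);
    return 0;
}

The real kit runs a couple of thousand of these loops concurrently; the think time is what lets the active-user count climb well past the number of CPUs before the machine saturates.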
At low user counts (100 users), the lock profile from a single user's point of view looks like this:

# dtrace -q -s 84_lwlock.d 1764

             Lock Id         Mode       State                Count
       ProcArrayLock       Shared     Waiting                    1
     CLogControlLock       Shared    Acquired                    2
       ProcArrayLock    Exclusive     Waiting                    3
       ProcArrayLock    Exclusive    Acquired                   24
          XidGenLock    Exclusive    Acquired                   24
    FirstLockMgrLock       Shared    Acquired                   25
     CLogControlLock    Exclusive    Acquired                   26
 FirstBufMappingLock       Shared    Acquired                   55
       WALInsertLock    Exclusive    Acquired                   75
       ProcArrayLock       Shared    Acquired                  178
      SInvalReadLock       Shared    Acquired                  378

             Lock Id         Mode       State   Combined Time (ns)
      SInvalReadLock                 Acquired                29849
       ProcArrayLock       Shared     Waiting                92261
       ProcArrayLock                 Acquired               951470
    FirstLockMgrLock    Exclusive    Acquired              1069064
     CLogControlLock    Exclusive    Acquired              1295551
       ProcArrayLock    Exclusive     Waiting              1758033
 FirstBufMappingLock    Exclusive    Acquired              2078507
          XidGenLock    Exclusive    Acquired              3460800
       WALInsertLock    Exclusive    Acquired             12205466
      SInvalReadLock    Exclusive    Acquired             42684236
       ProcArrayLock    Exclusive    Acquired             57397139

As the user count grows beyond 1000, the same sample user's profile changes to the following:

# dtrace -q -s 84_lwlock.d 1764

             Lock Id         Mode       State                Count
     CLogControlLock    Exclusive     Waiting                    1
       WALInsertLock    Exclusive     Waiting                    1
       ProcArrayLock    Exclusive    Acquired                    7
          XidGenLock    Exclusive    Acquired                    7
       ProcArrayLock    Exclusive     Waiting                   10
     CLogControlLock       Shared    Acquired                   13
       WALInsertLock    Exclusive    Acquired                   23
     CLogControlLock    Exclusive    Acquired                   30
       ProcArrayLock       Shared    Acquired                   50
    FirstLockMgrLock       Shared    Acquired                  104
      SInvalReadLock       Shared    Acquired                  105
 FirstBufMappingLock       Shared    Acquired                  106

             Lock Id         Mode       State   Combined Time (ns)
       WALInsertLock    Exclusive     Waiting                73990
     CLogControlLock    Exclusive     Waiting               383066
          XidGenLock    Exclusive    Acquired               408301
     CLogControlLock    Exclusive    Acquired              1871642
       ProcArrayLock                 Acquired              2825372
       WALInsertLock    Exclusive    Acquired              3144580
    FirstLockMgrLock    Exclusive    Acquired              3799818
 FirstBufMappingLock    Exclusive    Acquired              4083473
      SInvalReadLock    Exclusive    Acquired             20611120
       ProcArrayLock    Exclusive    Acquired             37920098
       ProcArrayLock    Exclusive     Waiting           3783942020

That is similar to what I saw last year, and it is the reason I am playing with lwlock.c: to see how LWLockRelease() can be modified to do different kinds of wake-ups, and what impact that has on the top waiting time, which is basically wasted time from the perspective of the application, the operating system, and the CPU. All I am saying is that with some tuning flexibility we can reduce that wasted time and probably spend it in the acquired state doing useful work instead. I don't think I have misconfigured the system; I am just showing that there are ways to cut down some inefficiencies here, and giving test points. I am also showing where it does seem to help performance. It may not help in every case, but I have given you one test where it performs better than what we have today.

And, for the third time: the test users have think time built into them, which is what generally lets you run more users than there are CPUs on the system, and that is exactly what we want to exploit. Otherwise, if every user did its work with no think time at all, we would need 6+ billion CPUs to handle all possible users. Typically, as an administrator (system and database), I can only tweak and control the latencies within my domain, that is, network, disk, CPUs and so on. Those are what I am tuning to arrive at a *configured* environment, and now I am trying to reduce lock contention and waits in PostgreSQL so that we end up with an optimized setup.
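To illustrate the kind of knob being experimented with, here is a small standalone model in C (this is not a patch to lwlock.c and not PostgreSQL code; the queue walk only mirrors the spirit of LWLockRelease()'s "wake the head exclusive waiter, or all consecutive shared waiters" behaviour, and the cap on shared wake-ups is the hypothetical tunable under discussion):

/* wakeup_model.c -- standalone toy model of waking waiters from an
 * LWLock-style wait queue, with an optional cap on how many shared
 * waiters are released at once.  Purely illustrative.
 * Build:  cc wakeup_model.c -o wakeup_model
 */
#include <stdio.h>
#include <stdbool.h>

typedef struct Waiter
{
    int            pid;        /* stand-in for a waiting backend */
    bool           exclusive;  /* waiting for exclusive or shared mode? */
    struct Waiter *next;
} Waiter;

/*
 * Wake waiters at the head of the queue.  With cap <= 0 this mimics the
 * stock idea: an exclusive waiter at the head is woken alone, otherwise
 * every consecutive shared waiter at the head is woken.  With a positive
 * cap, at most 'cap' shared waiters are woken and the rest stay queued.
 * Returns the new head of the queue.
 */
static Waiter *
release_waiters(Waiter *head, int cap)
{
    int woken = 0;

    if (head == NULL)
        return NULL;

    if (head->exclusive)
    {
        printf("wake exclusive waiter %d\n", head->pid);
        return head->next;
    }

    while (head != NULL && !head->exclusive && (cap <= 0 || woken < cap))
    {
        printf("wake shared waiter %d\n", head->pid);
        head = head->next;
        woken++;
    }
    return head;
}

int
main(void)
{
    /* queue: four shared waiters followed by one exclusive waiter */
    Waiter w5 = {105, true,  NULL};
    Waiter w4 = {104, false, &w5};
    Waiter w3 = {103, false, &w4};
    Waiter w2 = {102, false, &w3};
    Waiter w1 = {101, false, &w2};

    puts("uncapped:");
    release_waiters(&w1, 0);    /* wakes 101, 102, 103, 104 */

    puts("capped at 2 shared wake-ups:");
    release_waiters(&w1, 2);    /* wakes 101 and 102 only */

    return 0;
}

The interesting question, which only the throughput runs can answer, is whether waking fewer waiters per release keeps the "Exclusive Waiting" time on ProcArrayLock from blowing up the way it does in the second profile above.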
I am trying another run in which I limit the number of woken-up waiters to a pre-configured value, to see how various settings pan out in terms of throughput on this server.

Regards,
Jignesh