ProcArrayLock (The Saga continues)

Based on feedback after the sessions, I ran a few more tests that might be useful to share.

One suggestion was to make each client do more work and reduce the number of clients. The igen benchmark is flexible, so I removed all think time from it and repeated the test until scalability stopped. (This was done with CVS sources downloaded yesterday.)

Note that with no think time, each client is about 75% CPU busy, from what I observed. Running it this way, client scaling now saturates at about 60 (compared to 500 in the original test). Peak throughput was at about 50 users (using synchronous_commit=off).

Here is the interesting DTrace lock output (lock id, lock mode, and time in ns spent waiting for the lock in a 10-sec snapshot). I am showing just the top few entries, in ascending order:

With fewer than 20 users, WALInsert is at the top:
52 Exclusive  721950129
4  Exclusive  768537190
46 Exclusive  842063837
7  Exclusive 1031851713

With 35 Users:
52 Exclusive  2599074739
4  Exclusive  2647927574
46 Exclusive  2789581991
7  Exclusive  3220008691

At about 50 users, the peak I saw earlier (peak throughput):
46 Exclusive  3669210393
4  Exclusive  6024966938
52 Exclusive  6529168107
7  Exclusive  9408290367

With about 60 users, where the throughput actually starts to drop:
41 Exclusive   4570660567
52 Exclusive  10706741643
46 Exclusive  13152005125
4  Exclusive  13550187806
7  Exclusive  22146882562


With about 100 users (throughput below the peak value):
42 Exclusive    4238582775
46 Exclusive    6773515243
7  Exclusive    7467346038
52 Exclusive    9846216440
4  Shared      22528501166
4  Exclusive  223043774037

So it seems that when the shared and exclusive wait times for ProcArrayLock are the top two entries, the system is basically saturated in terms of the throughput it can handle.
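
To put the 100-user numbers in perspective, here is a minimal Python sketch that converts the wait times into an average number of backends blocked per lock and computes ProcArrayLock's share of the listed wait time. It assumes the DTrace totals are summed across all backends over the 10-sec snapshot, and it only sees the top rows shown above, not the full output. The helper names (avg_waiters, wait_share_by_lock) are made up for illustration.

```python
# Rows copied from the 100-user snapshot: (lock id, mode, wait time in ns).
SNAPSHOT_100_USERS = [
    (42, "Exclusive", 4238582775),
    (46, "Exclusive", 6773515243),
    (7, "Exclusive", 7467346038),
    (52, "Exclusive", 9846216440),
    (4, "Shared", 22528501166),
    (4, "Exclusive", 223043774037),
]

SNAPSHOT_SECONDS = 10  # snapshot length stated in the post

# Only names mentioned in the post; 7 is inferred from "WALInsert is at the top".
LOCK_NAMES = {4: "ProcArrayLock", 5: "SInvalLock", 7: "WALInsert"}


def avg_waiters(wait_ns, snapshot_seconds=SNAPSHOT_SECONDS):
    """Average number of backends blocked on a lock during the snapshot.

    Assumption: wait_ns is the wait time summed over all backends.
    """
    return wait_ns / 1e9 / snapshot_seconds


def wait_share_by_lock(rows):
    """Fraction of the listed wait time attributed to each lock id (all modes)."""
    total = sum(ns for _, _, ns in rows)
    per_lock = {}
    for lock_id, _, ns in rows:
        per_lock[lock_id] = per_lock.get(lock_id, 0) + ns
    return {lock_id: ns / total for lock_id, ns in per_lock.items()}


if __name__ == "__main__":
    for lock_id, mode, ns in SNAPSHOT_100_USERS:
        name = LOCK_NAMES.get(lock_id, "lock %d" % lock_id)
        print(f"{name:14s} {mode:9s} ~{avg_waiters(ns):5.1f} backends waiting")
    shares = wait_share_by_lock(SNAPSHOT_100_USERS)
    print(f"ProcArrayLock share of listed wait time: {shares[4]:.0%}")
```

Under that assumption, the exclusive ProcArrayLock row works out to roughly 22 of the 100 backends being blocked on it at any moment, which fits the picture of saturation.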

Optimizing the wait queues will help improve the shared waits, which might help the exclusive waits a bit, but eventually exclusive waits on ProcArrayLock will limit scaling with as few as 60-70 users.


Lock hold times are below (though taken from a different run).
With 30 users:

            Lock Id            Mode   Combined Time (ns)
            1616992       Exclusive           1199791629
                  4       Exclusive           1399371867
                 34       Exclusive           1426153620
            1616978       Exclusive           1528327035
            1616990       Exclusive           1546374298
            1616988       Exclusive           1553461559
                  5       Exclusive           2477558484

With 50+ users
            Lock Id            Mode   Combined Time (ns)
                  4       Exclusive           1438509198
            1616992       Exclusive           1450973466
            1616978       Exclusive           1505626978
            1616990       Exclusive           1850432217
            1616988       Exclusive           2033226225
                 34       Exclusive           2098542547
                  5       Exclusive           3280151374

With 100 users

            Lock Id            Mode   Combined Time (ns)
            1616992       Exclusive           1206516505
            1616988       Exclusive           1486704087
            1616990       Exclusive           1521900997
                 34       Exclusive           1532815803
            1616978       Exclusive           1541986895
                  5       Exclusive           2179043424
                  5                           2395098279

(Why was 5 printed with a blank mode?)
Rerunning it with a slight variation of the script:


            Lock Id            Mode   Combined Time (ns)
            1616996               0           1167708953
                 36               0           1291958451
                  5      4299305160           1344486968
                  4               0           1347557908
            1616978               0           1377931882
                 34               0           1724752938
                  5               0           2079012548

The trend of lock 4's hold time looks similar to the previous runs. The new kid is lock 5 with a mode other than 0 or 1; I am not sure whether that is causing problems. What mode is "4299305160" for lock 5 (SInvalLock)? In any case, at this point the wait time for lock 4 increases to the point where the database no longer scales.
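
One quick check on the odd mode value: the Python sketch below only does arithmetic on the printed number. It shows that the value does not fit in 32 bits, so it cannot be a plain 0/1 mode flag. What it actually is (for example, an uninitialized or pointer-sized probe argument) is a guess on my part, not something confirmed from the script. The function name describe is made up for illustration.

```python
ODD_MODE = 4299305160  # value printed in the Mode column for lock 5


def describe(value):
    """Break an integer into pieces that help judge whether it is a small flag."""
    return {
        "hex": hex(value),
        "fits_in_32_bits": value < 2**32,
        "low_32_bits": value & 0xFFFFFFFF,
        "high_32_bits": value >> 32,
    }


if __name__ == "__main__":
    for key, val in describe(ODD_MODE).items():
        print(key, val)
```

The value comes out as 0x1004230c8, with a nonzero upper half, so whatever the script printed there is wider than a 32-bit mode.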

Any thoughts?


-Jignesh




