Re: Proposal of tunable fix for scalability of 8.4

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 






In general, I suggest that it is useful to run tests with a few different types of pacing. Zero delay pacing will not have realistic number of connections, but will expose bottlenecks that are universal, and less controversial. Small latency (100ms to 1s) tests are easy to make from the zero delay ones, and help expose problems with connection count or other forms of ‘non-active’ concurrency. End-user realistic delays are app specific, and useful with larger holistic load tests (say, through the application interface). Generally, running them in this order helps because at each stage you are adding complexity. Based on your explanations, you’ve probably done much of this so far and your approach sounds solid to me. If the first case fails (zero delay, smaller user count), there is no way the others will pass.



I think I have done that before so I can do that again by running the users at 0 think time which will represent a "Connection pool" which is highly utilized" and test how big the connection pool can be before the throughput tanks.. This can be useful for App Servers which sets up connections pools of their own talking with PostgreSQL.

-Jignesh


So I backed out my change and used the stock 8.4 snapshot that I had downloaded.. With now 0 think time I do runs with lot less users.. still I cannot get it to go to 100% CPU
60: 8: Medium Throughput: 7761.000 Avg Medium Resp: 0.004
120: 16: Medium Throughput: 16876.000 Avg Medium Resp: 0.004
180: 24: Medium Throughput: 25359.000 Avg Medium Resp: 0.004
240: 32: Medium Throughput: 33104.000 Avg Medium Resp: 0.005
300: 40: Medium Throughput: 42200.000 Avg Medium Resp: 0.005
360: 48: Medium Throughput: 49996.000 Avg Medium Resp: 0.005
420: 56: Medium Throughput: 58260.000 Avg Medium Resp: 0.005
480: 64: Medium Throughput: 66289.000 Avg Medium Resp: 0.005
540: 72: Medium Throughput: 74667.000 Avg Medium Resp: 0.005
600: 80: Medium Throughput: 82632.000 Avg Medium Resp: 0.005
660: 88: Medium Throughput: 90211.000 Avg Medium Resp: 0.006
720: 96: Medium Throughput: 98236.000 Avg Medium Resp: 0.006
780: 104: Medium Throughput: 105517.000 Avg Medium Resp: 0.006
840: 112: Medium Throughput: 112921.000 Avg Medium Resp: 0.006
900: 120: Medium Throughput: 118256.000 Avg Medium Resp: 0.007
960: 128: Medium Throughput: 126499.000 Avg Medium Resp: 0.007
1020: 136: Medium Throughput: 133354.000 Avg Medium Resp: 0.007
1080: 144: Medium Throughput: 135826.000 Avg Medium Resp: 0.008
1140: 152: Medium Throughput: 121729.000 Avg Medium Resp: 0.012
1200: 160: Medium Throughput: 130487.000 Avg Medium Resp: 0.011
1260: 168: Medium Throughput: 123368.000 Avg Medium Resp: 0.013
1320: 176: Medium Throughput: 134649.000 Avg Medium Resp: 0.012
1380: 184: Medium Throughput: 136272.000 Avg Medium Resp: 0.013


Vmstat shows that CPUS are hardly busy in the 64-cpu system (CPUS are reported busy when there is active process assigned to the cpu)
-bash-3.2$ vmstat 30
kthr      memory            page            disk          faults      cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 sd in sy cs us sy id 19 0 0 52691088 46220848 27 302 10 68 68 0 3 1 -0 -0 -0 13411 20762 26854 5 3 92 0 0 0 45095664 39898296 0 455 0 0 0 0 0 0 0 0 0 698 674 295 0 0 100 0 0 0 45040640 39867056 5 13 0 0 0 0 0 0 0 0 0 3925 4189 5721 0 0 99 0 0 0 45038856 39864016 0 5 0 0 0 0 0 0 0 0 0 9479 8643 15205 1 1 98 0 0 0 45037760 39862552 0 14 0 0 0 0 0 0 0 0 0 12088 9041 19890 2 1 98 0 0 0 45035960 39860080 0 6 0 0 0 0 0 0 0 0 0 16590 11611 28351 2 1 97 0 0 0 45034648 39858416 0 17 0 0 0 0 0 0 0 0 0 19192 13027 33218 3 1 96 0 0 0 45032360 39855464 0 10 0 0 0 0 0 0 0 0 0 22795 16467 40392 4 1 95 0 0 0 45030840 39853568 0 22 0 0 0 0 0 0 0 0 0 25349 18315 45178 4 1 94 0 0 0 45027456 39849648 0 10 0 0 0 0 0 0 0 0 0 28158 22500 50804 5 2 93 0 0 0 45000752 39832608 0 38 0 0 0 0 0 0 0 0 0 31332 25744 56751 6 2 92 0 0 0 45010120 39836728 0 6 0 0 0 0 0 0 0 0 0 36636 29334 66505 7 2 91 0 0 0 45017072 39838504 0 29 0 0 0 0 0 0 0 0 0 38553 32313 70915 7 2 91 0 0 0 45011384 39833768 0 11 0 0 0 0 0 0 0 0 0 41186 35949 76275 8 3 90 0 0 0 44890552 39826136 0 40 0 0 0 0 0 0 0 0 0 45123 44507 83665 9 3 88 0 0 0 44882808 39822048 0 6 0 0 0 0 0 0 0 0 0 49342 53431 91783 10 3 87 0 0 0 45003328 39825336 0 42 0 0 0 0 0 0 0 0 0 48516 42515 91135 10 3 87 0 0 0 44999688 39821008 0 6 0 0 0 0 0 0 0 0 0 54695 48741 102526 11 3 85
kthr      memory            page            disk          faults      cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 sd in sy cs us sy id 0 0 0 44980744 39806400 0 55 0 0 0 0 0 0 0 0 0 54968 51946 103245 12 4 84 0 0 0 44992288 39812256 0 6 0 1 1 0 0 0 0 0 0 60506 58205 113911 13 4 83 0 0 0 44875648 39802128 1 60 0 0 0 0 0 1 0 0 0 60485 66576 114081 13 4 83 0 0 0 44848792 39795008 0 8 0 0 0 0 0 1 0 0 0 66760 75060 126202 15 5 80 0 0 0 44837168 39786432 0 57 0 0 0 0 0 0 0 0 0 66015 68256 125209 15 4 81 1 0 0 44832680 39779064 0 7 0 0 0 0 0 0 0 0 0 72728 79089 138077 17 5 79 1 0 0 44926640 39773160 0 69 0 0 0 0 0 0 0 0 0 71990 79148 136786 17 5 78 1 0 0 44960800 39781416 0 6 0 0 0 0 0 0 0 0 0 75442 77829 143783 18 5 77 1 0 0 44846472 39773960 0 68 0 0 0 0 0 0 0 0 0 80395 97964 153336 19 6 75 1 0 0 44887168 39770680 0 7 0 0 0 0 0 0 0 0 0 80010 88144 152699 19 6 75 1 0 0 44951152 39769576 0 68 0 0 0 0 0 0 0 0 0 83670 85394 159745 20 6 74 1 0 0 44946080 39763120 0 7 0 0 0 0 0 0 0 0 0 85416 91961 163147 21 6 73 1 0 0 44923928 39744640 0 83 0 0 0 0 0 0 0 0 0 87625 104894 167412 22 6 71 1 0 0 44929704 39745368 0 7 0 0 0 0 0 0 0 0 0 93280 103922 178357 24 7 69 1 0 0 44822712 39738744 0 82 0 0 0 0 0 0 0 0 0 91739 113747 175232 23 7 70 1 0 0 44790040 39730168 0 6 0 0 0 0 0 0 0 0 0 96159 122496 183642 25 7 68 1 0 0 44868808 39733872 0 82 0 0 0 0 0 0 0 0 0 96166 107465 183502 25 7 68 2 0 0 44913296 39730272 0 6 0 0 0 0 0 0 0 0 0 103573 114064 197502 27 8 65 1 0 0 44890768 39712424 0 96 0 0 0 0 0 0 0 0 0 102235 123767 194747 28 8 64
kthr      memory            page            disk          faults      cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 sd in sy cs us sy id 2 0 0 44900096 39716808 0 6 0 0 0 0 0 0 0 0 0 97323 112955 185647 27 8 65 1 0 0 44793360 39708336 0 94 0 0 0 0 0 0 0 0 0 98631 131539 188076 27 8 65 2 0 0 44765136 39700536 0 8 0 0 0 0 0 0 0 0 0 90489 117037 172603 27 8 66 1 0 0 44887392 39700024 0 94 0 0 0 0 0 0 0 0 0 95832 106992 182677 27 8 65 2 0 0 44881856 39692632 0 6 0 0 0 0 0 0 0 0 0 95015 109679 181194 27 8 65 1 0 0 44860928 39674856 0 110 0 0 0 0 0 0 0 0 0 92909 119383 177459 27 8 65 1 0 0 44861320 39671704 0 8 0 0 0 0 0 0 0 0 0 94677 110967 180832 28 8 64 1 0 0 44774424 39676000 0 108 0 0 0 0 0 0 0 0 0 94953 123457 181397 27 8 65 1 0 0 44733000 39668528 0 6 0 0 0 0 0 0 0 0 0 100719 132038 192550 29 9 63 1 0 0 44841888 39668864 0 106 0 0 0 0 0 0 0 0 0 97293 109177 185589 28 8 64 1 0 0 44858976 39663592 0 6 0 0 0 0 0 0 0 0 0 103199 118256 197049 30 9 62 1 0 0 44837216 39646416 0 122 0 0 0 0 0 0 0 0 0 105637 133127 201788 31 9 60 1 0 0 44842624 39647232 0 8 0 0 0 0 0 0 0 0 0 110530 131454 211139 32 9 59 2 0 0 44740624 39638832 1 127 0 0 0 0 0 0 0 0 0 111114 145135 212398 32 9 59 2 0 0 44690824 39628568 0 8 0 0 0 0 0 0 0 0 0 109934 146164 210454 32 10 59 2 0 0 44691912 39616000 0 132 0 0 0 0 0 0 0 0 0 108231 132279 206885 32 9 59 1 0 0 44797968 39609832 0 9 0 0 0 0 0 0 0 0 0 111582 135125 213446 33 10 58 3 0 0 44781632 39598432 0 135 0 0 0 0 0 0 0 0 0 115277 150356 220792 34 10 56 5 0 0 44791408 39600432 0 10 0 0 0 0 0 0 0 0 0 111428 137996 212559 33 9 58
kthr      memory            page            disk          faults      cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 sd in sy cs us sy id 3 0 0 44710008 39603320 0 135 0 0 0 0 0 0 0 0 0 110564 145678 211567 33 10 57 5 0 0 44663368 39595008 0 6 0 0 0 0 0 0 0 0 0 108891 143083 208389 33 10 58 3 0 0 44753496 39593824 0 132 0 0 0 0 0 0 0 0 0 109922 126865 209869 33 9 57 4 0 0 44788368 39588528 0 7 0 0 0 0 0 0 0 0 0 108680 129073 208068 33 10 57 2 0 0 44767920 39570592 0 147 0 0 0 0 0 0 0 0 0 106671 142403 204724 33 10 58 4 0 0 44762656 39563256 0 11 0 0 0 0 0 0 0 0 0 106185 130328 204551 34 10 57 2 0 0 44674584 39560912 0 148 0 0 0 0 0 0 0 0 0 104757 139147 201448 32 10 58 1 0 0 44619824 39551024 0 9 0 0 0 0 0 0 0 0 0 103653 142125 199896 32 10 58 2 0 0 44622480 39552432 0 141 0 0 0 0 0 0 0 0 0 101373 134547 195553 32 9 58 1 0 0 44739936 39552312 0 11 0 0 0 0 0 0 0 0 0 102932 121742 198205 33 9 58


And lock stats are as follows at about 280 users sampling for a single backend process:
# ./84_lwlock.d 29405

              Lock Id            Mode           State           Count
         WALWriteLock       Exclusive        Acquired               1
           XidGenLock       Exclusive         Waiting               1
      CLogControlLock          Shared         Waiting               3
        ProcArrayLock          Shared         Waiting               7
      CLogControlLock       Exclusive         Waiting               9
        WALInsertLock       Exclusive         Waiting              45
      CLogControlLock          Shared        Acquired              52
        ProcArrayLock       Exclusive         Waiting              61
           XidGenLock       Exclusive        Acquired              96
        ProcArrayLock       Exclusive        Acquired              97
      CLogControlLock       Exclusive        Acquired             152
        WALInsertLock       Exclusive        Acquired             302
        ProcArrayLock          Shared        Acquired             729
     FirstLockMgrLock          Shared        Acquired             812
  FirstBufMappingLock          Shared        Acquired             857
       SInvalReadLock          Shared        Acquired            1551

              Lock Id            Mode           State   Combined Time (ns)
        WALInsertLock                        Acquired                89909
           XidGenLock       Exclusive         Waiting               101488
         WALWriteLock       Exclusive        Acquired               140563
      CLogControlLock          Shared         Waiting               354756
  FirstBufMappingLock                        Acquired               471438
     FirstLockMgrLock                        Acquired              2907141
           XidGenLock       Exclusive        Acquired              7450934
      CLogControlLock       Exclusive         Waiting             11094716
        ProcArrayLock                        Acquired             15495229
        WALInsertLock       Exclusive         Waiting             20801169
      CLogControlLock       Exclusive        Acquired             21339264
       SInvalReadLock                        Acquired             24309991
     FirstLockMgrLock       Exclusive        Acquired             39904071
  FirstBufMappingLock       Exclusive        Acquired             40826435
        ProcArrayLock          Shared         Waiting             86352947
        WALInsertLock       Exclusive        Acquired             89336432
       SInvalReadLock       Exclusive        Acquired            252574515
        ProcArrayLock       Exclusive        Acquired            315064347
        ProcArrayLock       Exclusive         Waiting            847806215

mpstat outputs is too much so I am doing aggegation by procesor set which is all 64 cpus

-bash-3.2$ mpstat -a 10

SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze 0 370 0 118649 127575 7595 244456 43931 62166 8700 0 158929 38 11 0 50 64 0 167 0 119668 128704 7644 246389 43287 62357 8816 0 161006 38 11 0 51 64 0 27 0 109461 117433 6997 224514 38562 56446 8171 0 148322 34 10 0 56 64 0 2 0 122368 131549 7871 250237 39620 61478 9082 0 165995 36 11 0 52 64 0 0 0 122025 131380 7973 249429 37292 59863 8922 0 166319 35 11 0 54 64

(quick overview of columns )
    SET    Processor set
    minf          minor faults
    mjf           major faults
    xcal          inter-processor cross-calls
    intr          interrupts
    ithr          interrupts  as  threads  (not  counting  clock
                  interrupt)
    csw           context switches
    icsw          involuntary context switches
    migr          thread migrations (to another processor)
    smtx          spins on mutexes (lock not acquired  on  first
                  try)
    srw           spins  on  readers/writer  locks   (lock   not
                  acquired on first try)
    syscl         system calls
    usr           percent user time
    sys           percent system time
    wt            the I/O wait time is no longer calculated as a
                  percentage  of  CPU  time,  and this statistic
                  will always return zero.
    idl           percent idle time
    sze           number of processors in the requested  proces-
                  sor set


-Jignesh


--
Jignesh Shah           http://blogs.sun.com/jkshah  			
The New Sun Microsystems,Inc   http://sun.com/postgresql


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux