On 2014-07-01 21:48:35 +1200, Mark Kirkwood wrote:
> On 27/06/14 21:19, Andres Freund wrote:
> >On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote:
> >>My feeling is spinlock or similar, 'perf top' shows
> >>
> >>kernel find_busiest_group
> >>kernel _raw_spin_lock
> >>
> >>as the top time users.
> >
> >Those don't tell that much by themselves, could you do a hierarchical
> >profile? I.e. perf record -ga? That'll at least give the callers for
> >kernel level stuff. For more information compile postgres with
> >-fno-omit-frame-pointer.
> >
>
> Unfortunately this did not help - had lots of unknown symbols from postgres
> in the profile - I'm guessing the Ubuntu postgresql-9.3 package needs either
> the -dev package or to be rebuilt with the enable profile option (debug and
> no-omit-frame-pointer seem to be there already).

You need to install the -dbg package. My bet is you'll see s_lock high
in the profile, called mainly from the procarray and buffer mapping
lwlocks.

> Test: pgbench
> Options: scale 500
>          read only
> Os: Ubuntu 14.04
> Pg: 9.3.4
> Pg Options:
>     max_connections = 200

Just as an experiment I'd suggest increasing max_connections by one and
two and quickly retesting - there's some cacheline alignment issues that
aren't fixed yet that happen to vanish with some max_connections
settings.

>     shared_buffers = 10GB
>     maintenance_work_mem = 1GB
>     effective_io_concurrency = 10
>     wal_buffers = 32MB
>     checkpoint_segments = 192
>     checkpoint_completion_target = 0.8
>
>
> Results
>
> Clients | 9.3 tps 32 cores | 9.3 tps 60 cores
> --------+------------------+-----------------
>       6 |            70400 |            71028
>      12 |            98918 |           129140
>      24 |           230345 |           240631
>      48 |           324042 |           409510
>      96 |           346929 |           120464
>     192 |           312621 |            92663
>
> So we have anti scaling with 60 cores as we increase the client connections.
> Ouch!
> A level of urgency led to trying out Andres's 'rwlock' 9.4 branch [1]
> - cherry picking the last 5 commits into 9.4 branch and building a package
> from that and retesting:
>
> Clients | 9.4 tps 60 cores (rwlock)
> --------+--------------------------
>       6 |                    70189
>      12 |                   128894
>      24 |                   233542
>      48 |                   422754
>      96 |                   590796
>     192 |                   630672
>
> Wow - that is more like it! Andres that is some nice work, we definitely owe
> you some beers for that :-) I am aware that I need to retest with an
> unpatched 9.4 src - as it is not clear from this data how much is due to
> Andres's patches and how much to the steady stream of 9.4 development. I'll
> post an update on that later, but figured this was interesting enough to
> note for now.

Cool. That's what I like (and expect) to see :). I don't think unpatched
9.4 will show significantly different results than 9.3, but it'd be good
to validate that. If you do so, could you post the results in the
-hackers thread I just CCed you on? That'll help the work to get into
9.5.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services