Re: 60 core performance with 9.3

On 27/06/14 14:01, Scott Marlowe wrote:
On Thu, Jun 26, 2014 at 5:49 PM, Mark Kirkwood
<mark.kirkwood@xxxxxxxxxxxxxxx> wrote:
I have a nice toy to play with: a Dell R920 with 60 cores and 1TB RAM [1].

The context is that the customer's current machine is a 32 core one,
and due to growth we are looking at something larger (hence 60 cores).

Some initial tests show pgbench read only performance similar to what Robert
found here
http://rhaas.blogspot.co.nz/2012/04/did-i-say-32-cores-how-about-64.html
(actually a bit quicker, around 400000 tps).

However, a mixed read-write workload gives results the same as, or only
marginally quicker than, the 32 core machine, particularly at higher
numbers of clients (e.g. 200-500). I have yet to break out the perf toolset,
but I'm wondering if anyone has compared 32 and 60 (or 64) core read-write
pgbench performance?
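
For anyone wanting to repeat the comparison, the client-count sweep could be scripted roughly as below. This is only a sketch under assumptions that are not in the original message: the database name "pgbench", the 300 second run length, the intermediate client counts (300 and 400) and the cap of 60 worker threads are all placeholders, and the default pgbench script is assumed to stand in for the mixed read-write workload. The flags used (-c, -j, -T, -S) are standard pgbench options. Older pgbench versions require the client count to be a multiple of the thread count, so the helper picks the largest suitable divisor.

```python
# Sketch: build pgbench command lines for a read only vs read-write sweep.
# dbname, duration_s and max_threads are hypothetical placeholders.

def threads_for(clients, max_threads=60):
    """Largest thread count <= max_threads that divides the client count."""
    return max(t for t in range(1, max_threads + 1) if clients % t == 0)

def pgbench_cmd(clients, read_only, duration_s=300, dbname="pgbench"):
    """Return the pgbench argument list for one run."""
    cmd = ["pgbench",
           "-c", str(clients),               # number of concurrent clients
           "-j", str(threads_for(clients)),  # number of worker threads
           "-T", str(duration_s)]            # run length in seconds
    if read_only:
        cmd.append("-S")                     # built-in select-only script
    cmd.append(dbname)
    return cmd

if __name__ == "__main__":
    for clients in (200, 300, 400, 500):
        for read_only in (True, False):
            print(" ".join(pgbench_cmd(clients, read_only)))
```

The script only prints the commands; passing each list to subprocess.run would execute them against a database already initialised with pgbench -i.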

My guess is that the read only test is CPU / memory bandwidth limited,
but the mixed test is IO bound.

What do your iostat / vmstat / iotop etc. look like when you're doing
both read only and read/write mixed?



That was what I would have thought too, but it does not appear to be the case. Here is a typical iostat:

Device:   rrqm/s  wrqm/s   r/s       w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda         0.00    0.00  0.00      0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
nvme0n1     0.00    0.00  0.00   4448.00   0.00  41.47     19.10      0.14   0.03     0.00     0.03   0.03  14.40
nvme1n1     0.00    0.00  0.00   4448.00   0.00  41.47     19.10      0.15   0.03     0.00     0.03   0.03  15.20
nvme2n1     0.00    0.00  0.00   4549.00   0.00  42.20     19.00      0.15   0.03     0.00     0.03   0.03  15.20
nvme3n1     0.00    0.00  0.00   4548.00   0.00  42.19     19.00      0.16   0.04     0.00     0.04   0.04  16.00
dm-0        0.00    0.00  0.00      0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
md0         0.00    0.00  0.00  17961.00   0.00  83.67      9.54      0.00   0.00     0.00     0.00   0.00   0.00
dm-1        0.00    0.00  0.00      0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
dm-2        0.00    0.00  0.00      0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
dm-3        0.00    0.00  0.00      0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
dm-4        0.00    0.00  0.00      0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
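
To make the "not IO bound" reading of those numbers concrete, here is a small sketch that parses the non-idle rows from the sample above and reports the busiest device. The row data is copied from the iostat output; the helper names (parse_rows, busiest) are made up for illustration, and only this one sample is examined, so it says nothing about other intervals.

```python
# Sketch: find the busiest device in one iostat sample.
# Rows are copied from the iostat output above (idle devices omitted).
IOSTAT_ROWS = """\
nvme0n1 0.00 0.00 0.00 4448.00 0.00 41.47 19.10 0.14 0.03 0.00 0.03 0.03 14.40
nvme1n1 0.00 0.00 0.00 4448.00 0.00 41.47 19.10 0.15 0.03 0.00 0.03 0.03 15.20
nvme2n1 0.00 0.00 0.00 4549.00 0.00 42.20 19.00 0.15 0.03 0.00 0.03 0.03 15.20
nvme3n1 0.00 0.00 0.00 4548.00 0.00 42.19 19.00 0.16 0.04 0.00 0.04 0.04 16.00
md0 0.00 0.00 0.00 17961.00 0.00 83.67 9.54 0.00 0.00 0.00 0.00 0.00 0.00
"""

COLUMNS = ("rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz "
           "await r_await w_await svctm %util").split()

def parse_rows(text):
    """Map device name -> {column: value} for each well-formed row."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1:
            continue  # skip anything that is not a full data row
        stats[fields[0]] = dict(zip(COLUMNS, map(float, fields[1:])))
    return stats

def busiest(stats):
    """Return (device, row) for the device with the highest %util."""
    return max(stats.items(), key=lambda kv: kv[1]["%util"])

stats = parse_rows(IOSTAT_ROWS)
device, row = busiest(stats)
print(device, row["%util"], row["w_await"])  # prints: nvme3n1 16.0 0.04
```

The busiest device in this sample sits at 16% utilisation with a write await of 0.04 ms, which is why the storage does not look like the limiting factor here.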


My feeling is that it is a spinlock or similar: 'perf top' shows

kernel find_busiest_group
kernel _raw_spin_lock

as the top time users.


