Re: Deadly slow Ceph cluster revisited

On 07/17/2015 09:55 AM, J David wrote:
On Fri, Jul 17, 2015 at 10:21 AM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
rados -p <pool> bench 30 write

just to see how it handles 4MB object writes.

Here's that, from the VM host:

Total time run:         52.062639
Total writes made:      66
Write size:             4194304
Bandwidth (MB/sec):     5.071

Yep, awfully slow!


Stddev Bandwidth:       11.6312
Max bandwidth (MB/sec): 80
Min bandwidth (MB/sec): 0
Average Latency:        12.436

12 second average latency! Yikes. That does either sound like network or one of the disks is very slow. Do you see faster performance during the first second or two of the rados bench run? That might indicate that you are backing up on a specific OSD.
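One quick way to check for a slow outlier (a sketch, assuming a release where `ceph osd perf` reports per-OSD fs_commit_latency/fs_apply_latency in milliseconds, as Hammer does):

```shell
# Sort OSDs by fs_commit_latency(ms) so a slow outlier floats to the top.
# Against the live cluster you would run:
#   ceph osd perf | tail -n +2 | sort -k2 -nr | head
# Illustrative sample data substituted here via a here-doc
# (columns: osd  fs_commit_latency(ms)  fs_apply_latency(ms)):
sort -k2 -nr <<'EOF' | head -n 1
0 3 4
1 812 950
2 5 6
EOF
# In this sample, osd.1 sorts first with 812 ms commit latency.
```

If one OSD consistently shows commit latency an order of magnitude above the rest, that disk (or its node) is the place to look first.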

Stddev Latency:         13.6272
Max latency:            51.6924
Min latency:            0.073353

Unfortunately I don't know much about how to parse this (other than
that 5MB/sec writes match up with our best-case performance in the VM
guest).
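For what it's worth, the headline number is just total data over total time: 66 completed 4MB object writes over 52.06 seconds:

```shell
# Bandwidth (MB/sec) = (writes made * write size in MB) / total time run
awk 'BEGIN { printf "%.3f\n", 66 * 4 / 52.062639 }'   # prints 5.071
```

The latency lines are per-operation: with rados bench's default of 16 concurrent ops in flight, a 12.4 s average per 4MB write is consistent with the ~5 MB/s aggregate.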

If rados bench is
also terribly slow, then you might want to start looking for evidence of IO
getting hung up on a specific disk or node.

Thus far, no evidence of that has presented itself: iostat looks good
on every drive, and the nodes are all equally loaded.

Ok. Maybe try some iperf tests between the different OSD nodes in your cluster, and also from the client to the OSDs.
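Roughly like this (a sketch; substitute your own host names, and adjust for whichever iperf version you have installed):

```shell
# On each OSD node in turn, start a server:
#   iperf -s
# Then, from every other OSD node and from the VM host/client:
#   iperf -c <osd-node> -t 10
# Compare the results against what rados bench reports. Note iperf
# prints Gbit/s while rados bench prints MB/s; for example, a full
# 10GbE link at ~9.4 Gbit/s works out to:
awk 'BEGIN { printf "%.0f\n", 9.4 * 1000 / 8 }'   # 1175 MB/s of raw line rate
```

If any node pair comes in far below line rate, or one direction is much worse than the other, that link is worth digging into before blaming the disks.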


Thanks!

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


