metrics to monitor for performance bottlenecks?

Hey folks,

I have a Ceph cluster backing about 500 VMs over RBD. I am seeing around 10-12k IOPS cluster-wide, and IO wait time is creeping up inside the VMs.

My suspicion is that I am pushing the Ceph cluster to its limit in terms of overall throughput. Are there metrics that can be collected passively, either in the VMs or on the Ceph nodes, that would reveal when the cluster is at its peak? IO wait time inside the VMs might be a good one, but I would like to monitor the Ceph nodes directly as well. Ideally I want to track those metrics, do some trend analysis, and provision capacity (throughput, not space) before VM performance is impacted.
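
To make the VM-side idea concrete, here is a rough sketch of passive IO wait sampling plus a simple trend estimate. It assumes Linux guests where the aggregate "cpu" line in /proc/stat carries the usual counters (user, nice, system, idle, iowait, irq, softirq, steal, ...). The function names, the 5-second interval, and the least-squares slope used as the "trend" are illustrative choices only, not an established Ceph monitoring method.

```python
import time


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly "cpu"; per-core lines are "cpu0", "cpu1", ...
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two counter samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Sum only the first eight counters; guest time is already counted in user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def linear_trend(samples):
    """Least-squares slope of the samples against their index (units per sample)."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return iowait percent."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return iowait_percent(before, after)
```

Calling sample_iowait() periodically inside a VM and feeding the collected values to linear_trend() gives a crude "is IO wait climbing" signal. A positive slope sustained over days would be the cue to look at the Ceph side.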

Any thoughts or experience on this matter?

Thanks.
-Simon