Hi List,
I've got a cluster of 5 OSD nodes running hammer. All of the OSD servers are identical, but one has about 3-4x higher load than the others, and the OSDs on this node are reporting high apply latency.
The cause of the load appears to be the OSD processes. About half of the OSD processes on that node are using between 100% and 185% CPU each, keeping the processor pegged around 85% utilization overall. In comparison, the other servers in the cluster are sitting around 30% CPU utilization and are reporting ~1.5ms of apply latency.
A few days ago I restarted the OSD processes and the problem went away, but now, three days later, it has returned. I don't see anything in the logs and there's no iowait on the disks.
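For reference, here is a rough sketch of how the per-OSD apply latency could be compared across the cluster to pick out the slow OSDs. It assumes `ceph osd perf --format json` returns a top-level `osd_perf_infos` list with `id` and `perf_stats.apply_latency_ms` fields; that layout is an assumption and may differ between releases. The function names and the 3x-median threshold are my own illustrative choices, not anything from Ceph.

```python
import json
import statistics
import subprocess


def slow_osds(perf, factor=3.0):
    """Return (osd_id, apply_latency_ms) pairs above factor x the cluster median.

    `perf` is the parsed JSON from `ceph osd perf`; the key names used here
    are an assumption about that output and may need adjusting.
    """
    stats = [
        (osd["id"], osd["perf_stats"]["apply_latency_ms"])
        for osd in perf["osd_perf_infos"]
    ]
    if not stats:
        return []
    # Use the median so that a few slow OSDs do not skew the baseline.
    threshold = factor * statistics.median(lat for _, lat in stats)
    flagged = [(osd_id, lat) for osd_id, lat in stats if lat > threshold]
    # Worst offenders first.
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


def fetch_perf():
    """Run `ceph osd perf` on a node with admin credentials and parse the JSON."""
    out = subprocess.check_output(["ceph", "osd", "perf", "--format", "json"])
    return json.loads(out)


if __name__ == "__main__":
    for osd_id, latency in slow_osds(fetch_perf()):
        print("osd.%d apply latency: %s ms" % (osd_id, latency))
```

Run periodically, something like this would at least show whether the same OSDs are slow every time the problem comes back after a restart.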
Does anyone have any ideas on how I can troubleshoot this further?
Thank You,
John
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com