Another strange thing I'm seeing is that two of the nodes in the cluster have some OSDs with almost no activity. If I watch top long enough I'll eventually see CPU utilization on these OSDs, but for the most part they sit at 0% CPU utilization. I'm not sure if this is expected behavior or not. I have another cluster running the same version of Ceph that shows the same symptom, but the OSDs in our Jewel cluster always show activity.
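One way to tell a truly idle OSD from a merely bursty one is to sample per-process CPU over a window instead of eyeballing top. A rough sketch using pidstat (from the sysstat package; the one-minute window is arbitrary):

```shell
# Sample CPU for every ceph-osd process once a second for 60 seconds,
# then keep only pidstat's trailing "Average:" lines so a genuinely
# idle OSD (avg ~0%) stands out from one that is just bursty.
pidstat -C ceph-osd 1 60 | awk '/^Average:/ && /ceph-osd/ { print }'
```

If the averages really are near zero, the next question is whether those OSDs are receiving any ops at all, which points back at the CRUSH/PG distribution rather than the daemons themselves.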
On Mon, Dec 18, 2017 at 11:51 AM, John Petrini <jpetrini@xxxxxxxxxxxx> wrote:
Hi David,

Thanks for the info. The controller in the server (a PERC H730) was just replaced and the battery is at full health. Prior to replacing the controller I was seeing very high iowait when running iostat, but I no longer see that behavior - just apply latency when running ceph osd perf. Since there's no iowait, it makes me believe the latency is not being introduced by the hardware, though I'm not ruling that out completely. I'd like to know what I can do to get a better understanding of what the OSD processes are so busy doing, because they are working much harder on this server than on the others.
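A quick way to narrow down which OSDs on that host are contributing the latency. This is a sketch: the 50 ms threshold is arbitrary, and it assumes the Jewel-era three-column `ceph osd perf` layout (osd, fs_commit_latency(ms), fs_apply_latency(ms)):

```shell
# Flag any OSD whose apply latency exceeds 50 ms in `ceph osd perf`.
# NR > 1 skips the header row; $3 is assumed to be fs_apply_latency(ms).
ceph osd perf | awk 'NR > 1 && $3 > 50 { print "osd." $1, "apply:", $3 " ms" }'
```

From there, `ceph daemon osd.N perf dump` run on the busy host exposes much finer-grained counters (journal, filestore, op queue) for the suspect OSDs, which can help show where the time is actually going.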
On Thu, Dec 14, 2017 at 11:33 AM, David Turner <drakonstein@xxxxxxxxx> wrote:

We show high disk latencies on a node when the controller's cache battery dies. This is assuming that you're using a controller with cache enabled for your disks. In any case, I would look at the hardware on the server.

On Thu, Dec 14, 2017 at 10:15 AM John Petrini <jpetrini@xxxxxxxxxxxx> wrote:

Anyone have any ideas on this?
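Even with a replaced controller, it can be worth confirming the cache battery/CacheVault state directly rather than trusting the health LED. A sketch for a PERC H730 - the perccli binary, the /c0 controller index, and the fields grepped for are all assumptions; adjust for your tooling:

```shell
# Query the battery backup unit on controller 0 and pull out the
# state-related lines; perccli mirrors storcli syntax on Dell PERCs.
perccli /c0/bbu show all | grep -Ei 'state|voltage|temperature'
```

If the BBU is degraded the controller typically drops from write-back to write-through caching, which shows up exactly as elevated apply latency on otherwise healthy disks.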
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com