Re: How to monitor health and connectivity of OSD

On Mon, Feb 8, 2016 at 3:25 AM, Mariusz Gronczewski
<mariusz.gronczewski@xxxxxxxxxxxx> wrote:
> Is there an equivalent of 'ceph health', but for OSDs?
>
> Like warnings about slowness or trouble with communication between OSDs?
>
> I've spent a good amount of time debugging what looked like only stuck
> PGs, but it turned out to be a bad NIC, and that was only apparent once
> I saw some OSD logs like
>
> 2016-02-08 03:42:27.810289 7fc9b8bff700 -1 osd.9 146800 heartbeat_check: no reply from osd.14 ever on either front or back, first ping sent 2016-02-08 03:39:24.860852 (cutoff 2016-02-08 03:39:27.810288)
> 2016-02-08 03:42:27.810297 7fc9b8bff700 -1 osd.9 146800 heartbeat_check: no reply from osd.15 ever on either front or back, first ping sent 2016-02-08 03:39:24.860852 (cutoff 2016-02-08 03:39:27.810288)
> 2016-02-08 03:42:28.311125 7fc9b8bff700 -1 osd.9 146800 heartbeat_check: no reply from osd.14 ever on either front or back, first ping sent 2016-02-08 03:39:24.860852 (cutoff 2016-02-08 03:39:28.311124)
>
> (turned out to be bad nic, fuck emulex)
>
> Is there anything that could dump things like "failed heartbeats in
> the last 10 minutes" or similar stats?

I don't think that's exposed anywhere. If it happens often enough, the
OSD will get killed. We could maybe add some tracking structures and
an admin socket command to dump them from the OSD; you should create a
feature request at tracker.ceph.com. :)
-Greg
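
Until something like that exists, one stopgap is to count the
heartbeat_check lines in the OSD logs directly. Below is a minimal sketch,
assuming the log lines look like the ones quoted above (the format may
differ between Ceph releases). The function name failed_heartbeats and the
SAMPLE_LINES data are illustrative only and are not part of Ceph.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumed log format, taken from the lines quoted in this thread:
# "<date> <time> <thread> -1 osd.N <epoch> heartbeat_check: no reply from osd.M ..."
HEARTBEAT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \S+ +-?\d+ "
    r"(?P<reporter>osd\.\d+) \d+ heartbeat_check: no reply from (?P<peer>osd\.\d+)"
)


def failed_heartbeats(lines, now, window=timedelta(minutes=10)):
    """Count 'no reply' heartbeat lines per (reporter, peer) in [now - window, now]."""
    counts = Counter()
    for line in lines:
        match = HEARTBEAT_RE.match(line)
        if match is None:
            continue  # not a heartbeat_check failure line
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        if now - window <= ts <= now:
            counts[(match.group("reporter"), match.group("peer"))] += 1
    return counts


# Sample input: the three log lines from the original message.
SAMPLE_LINES = [
    "2016-02-08 03:42:27.810289 7fc9b8bff700 -1 osd.9 146800 heartbeat_check: "
    "no reply from osd.14 ever on either front or back, first ping sent "
    "2016-02-08 03:39:24.860852 (cutoff 2016-02-08 03:39:27.810288)",
    "2016-02-08 03:42:27.810297 7fc9b8bff700 -1 osd.9 146800 heartbeat_check: "
    "no reply from osd.15 ever on either front or back, first ping sent "
    "2016-02-08 03:39:24.860852 (cutoff 2016-02-08 03:39:27.810288)",
    "2016-02-08 03:42:28.311125 7fc9b8bff700 -1 osd.9 146800 heartbeat_check: "
    "no reply from osd.14 ever on either front or back, first ping sent "
    "2016-02-08 03:39:24.860852 (cutoff 2016-02-08 03:39:28.311124)",
]

if __name__ == "__main__":
    now = datetime(2016, 2, 8, 3, 45, 0)
    for (reporter, peer), n in failed_heartbeats(SAMPLE_LINES, now).most_common():
        print(f"{reporter} -> {peer}: {n} missed heartbeat replies")
```

The function accepts any iterable of lines, so an open OSD log file can be
passed in place of SAMPLE_LINES. A peer that shows up in the counts from
several different reporting OSDs at once is a reasonable first suspect for
a network problem like the one described above.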

>
> --
> Mariusz Gronczewski, Administrator
>
> Efigence S. A.
> ul. Wołoska 9a, 02-583 Warszawa
> T: [+48] 22 380 13 13
> F: [+48] 22 380 13 14
> E: mariusz.gronczewski@xxxxxxxxxxxx
> <mailto:mariusz.gronczewski@xxxxxxxxxxxx>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



