On Mon, Jan 16, 2017 at 7:15 PM, Alexey Sheplyakov <asheplyakov@xxxxxxxxxxxx> wrote:
> Hi,
>
> Actually, down/slow OSDs are detected by other OSDs. OSDs exchange
> heartbeats every few seconds.
> When an OSD hasn't received a reply from its neighbor within the
> grace period (default: 20 seconds),
> the OSD which sent the heartbeat considers its neighbor down and
> reports this to a monitor.
> After receiving 3 such reports in a row, the monitor marks the OSD in
> question as down.
>
> Also, OSDs report their status directly to the monitors every few (~2)
> minutes (to avoid flooding the monitors).
> If the monitor hasn't received any status reports from an OSD within
> a grace period (~15 minutes),
> the OSD in question is considered down.
>
> If all OSDs are down, there is no OSD left which could have reported
> its neighbors as down. Thus the monitor
> will consider all OSDs up until the OSD report grace period (~15
> minutes) expires.
>
> See http://docs.ceph.com/docs/giant/rados/configuration/mon-osd-interaction/#configuring-monitor-osd-interaction

http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/#configuring-monitor-osd-interaction

Giant is getting very old.

> for more details.
>
> Best regards,
> Alexey
>
>
> On Sat, Jan 14, 2017 at 7:59 PM, Jin Cai <caijin.laurence@xxxxxxxxx> wrote:
>> Hi all,
>> I deployed a Ceph cluster with Jewel on four physical machines.
>> Three of the machines were used for OSDs and each of them had eight
>> OSDs. The remaining one served as a monitor.
>> At first, everything worked well.
>> For testing purposes, I stopped all the OSD daemons and double-checked
>> that no OSD process was running.
>> After that, I executed ceph -s and got the following output:
>> osdmap e164: 24 osds: 7 up, 7 in
>> No matter how much time elapsed, the output didn't change.
>>
>> The expected output should be:
>> osdmap e164: 24 osds: 0 up, 0 in
>>
>> I think it is a matter of synchronisation between the OSDs and the monitor.
>> Could you please explain this strange phenomenon for me?
>> Thanks a bunch in advance.

--
Cheers,
Brad
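
For reference, the intervals Alexey describes map onto a handful of ceph.conf
options documented on the mon-osd-interaction page linked above. The sketch
below is illustrative only: the option names are the usual ones, but the values
shown are approximate Giant/Jewel-era defaults, so treat the exact numbers as
assumptions and check the documentation for your release:

    [osd]
    # How often an OSD pings its peer OSDs (seconds).
    osd heartbeat interval = 6
    # How long to wait for a heartbeat reply before reporting a peer
    # down to the monitor (the 20-second grace Alexey mentions).
    osd heartbeat grace = 20
    # An OSD reports its own status to the monitors at least this
    # often (the ~2-minute figure).
    osd mon report interval max = 120

    [mon]
    # How many distinct OSDs must report a peer as failed before the
    # monitor marks it down (default varies by release).
    mon osd min down reporters = 2
    # If the monitor hears nothing from an OSD for this long, it marks
    # the OSD down on its own (the ~15-minute grace, 900 seconds).
    mon osd report timeout = 900

With every OSD daemon stopped at once there are no surviving peers to send
failure reports, so nothing gets marked down until "mon osd report timeout"
expires, which is consistent with the behaviour Jin observed.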