Re: How does monitor know OSD is dead?


On Sat, Jun 29, 2019 at 6:51 PM Bryan Henderson <bryanh@xxxxxxxxxxxxxxxx> wrote:
> The reason it is so long is that you don't want to move data
> around unnecessarily if the osd is just being rebooted/restarted.

I think you're confusing down with out.  When an OSD is out, Ceph
backfills.  While it is merely down, Ceph hopes that it will come back.
But it will direct I/O to other redundant OSDs instead of a down one.

Going down leads to going out, and I believe that is the 600 seconds you
mention - the time between when the OSD is marked down and when Ceph marks it
out (if all other conditions permit).
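To make the down/out distinction concrete, here is a minimal Python sketch of the two-step transition described above. It is only an illustrative model, not Ceph's implementation: the class `OsdState`, its methods, and the constant `DOWN_OUT_INTERVAL` are hypothetical names, and the 600-second value is simply the default "mon osd down out interval" mentioned in this thread. It does not model the "other conditions" that can prevent an OSD from being marked out.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the 600 s default of "mon osd down out interval" quoted in the thread.
DOWN_OUT_INTERVAL = 600.0


@dataclass
class OsdState:
    """Hypothetical, simplified view of one OSD as the monitor sees it."""
    up: bool = True
    in_cluster: bool = True
    down_since: Optional[float] = None

    def mark_down(self, now: float) -> None:
        # Down: I/O goes to other redundant OSDs, but no data is moved yet.
        if self.up:
            self.up = False
            self.down_since = now

    def mark_up(self) -> None:
        # The OSD came back (for example after a reboot) before being marked out.
        # This sketch does not model marking an "out" OSD back "in".
        self.up = True
        self.down_since = None

    def tick(self, now: float, interval: float = DOWN_OUT_INTERVAL) -> None:
        # Out: only once the OSD has stayed down for the whole interval.
        if not self.up and self.in_cluster and self.down_since is not None:
            if now - self.down_since >= interval:
                self.in_cluster = False

    def needs_backfill(self) -> bool:
        # In this model, backfill starts only when the OSD is out.
        return not self.in_cluster


if __name__ == "__main__":
    osd = OsdState()
    osd.mark_down(now=0.0)
    osd.tick(now=300.0)
    print(osd.up, osd.needs_backfill())   # down, but not yet out
    osd.tick(now=600.0)
    print(osd.up, osd.needs_backfill())   # down and out
```

In this model an OSD that restarts within the interval is never marked out, so no data is moved for it, which is the behavior the quoted text is getting at.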

There is a pretty good explanation of how OSDs get marked down, which is
pretty complicated, at

  http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/

It just doesn't seem to match the implementation.

--
Bryan Henderson                                   San Jose, California

I mixed up my terminology. The first line should have read:
"I'm not sure why the monitor did not mark it _out_ after 600 seconds (default)"

The "down timeout" I mention is the "mon osd down out interval".

The rest of what I wrote is correct. I just want to make sure I don't confuse anyone else.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1 
