Re: Monitor Restart triggers half of our OSDs marked down

On Tue, Feb 3, 2015 at 2:38 PM, Christian Eichelmann
<christian.eichelmann@xxxxxxxx> wrote:
> Hi all,
>
> during some failover and configuration tests, we have discovered a
> strange phenomenon:
>
> Restarting one of our monitors (5 in total) triggers about 300 of the
> following events:
>
> osd.669 10.76.28.58:6935/149172 failed (20 reports from 20 peers after
> 22.005858 >= grace 20.000000)
>
> The OSDs come back up shortly after they have been marked down. What I
> don't understand is: how can restarting one monitor prevent the OSDs
> from talking to each other, so that they get marked down?
>
> FYI:
> We are currently using the following settings:
> mon osd adjust hearbeat grace = false
> mon osd min down reporters = 20
> mon osd adjust down out interval = false
>
> Regards,
> Christian


I can confirm similar behavior, though on a smaller scale: restarting the
leader mon may cause a small number of OSDs to be wrongly marked down, or
trigger a PG rebalance. The preconditions are still unclear.
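
To help narrow down the preconditions, it may be useful to count which OSDs
get reported and how far past the grace period each report was. Below is a
minimal Python sketch that parses lines shaped like the "failed" event quoted
above. It assumes each event sits on a single log line in exactly that
format; the names FAILED_RE, parse_failures and failures_per_osd are made up
for this illustration and are not part of Ceph.

```python
import re
from collections import Counter

# Assumed format, taken from the event quoted in this thread:
# osd.669 10.76.28.58:6935/149172 failed (20 reports from 20 peers
# after 22.005858 >= grace 20.000000)
FAILED_RE = re.compile(
    r"(?P<osd>osd\.\d+)\s+(?P<addr>\S+)\s+failed\s+"
    r"\((?P<reports>\d+)\s+reports\s+from\s+(?P<peers>\d+)\s+peers\s+"
    r"after\s+(?P<elapsed>\d+(?:\.\d+)?)\s+>=\s+"
    r"grace\s+(?P<grace>\d+(?:\.\d+)?)\)"
)


def parse_failures(lines):
    """Return one dict per line that contains a 'failed' event."""
    events = []
    for line in lines:
        match = FAILED_RE.search(line)
        if match is None:
            continue  # not a failure report, skip it
        events.append({
            "osd": match.group("osd"),
            "addr": match.group("addr"),
            "reports": int(match.group("reports")),
            "peers": int(match.group("peers")),
            "elapsed": float(match.group("elapsed")),
            "grace": float(match.group("grace")),
        })
    return events


def failures_per_osd(events):
    """Count how many failure events were seen for each OSD."""
    return Counter(event["osd"] for event in events)


if __name__ == "__main__":
    sample = [
        "osd.669 10.76.28.58:6935/149172 failed (20 reports from 20 peers "
        "after 22.005858 >= grace 20.000000)",
    ]
    events = parse_failures(sample)
    for osd, count in failures_per_osd(events).most_common():
        print(osd, count)
    for event in events:
        # How far past the grace period the reports were, in seconds.
        print(event["osd"], round(event["elapsed"] - event["grace"], 6))
```

Feeding it the monitor log from around the restart should show whether the
same OSDs are hit every time and whether the overshoot past the grace
period is consistently small, as in the 22 s versus 20 s example above.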



