On Thu, 5 Feb 2015, Dan van der Ster wrote:
> On Thu, Feb 5, 2015 at 9:54 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > On Thu, 5 Feb 2015, Dan van der Ster wrote:
> >> Hi,
> >> We also have seen this once after upgrading to 0.80.8 (from dumpling).
> >> Last week we had a network outage which marked out around 1/3rd of our
> >> OSDs. The outage lasted less than a minute -- all the OSDs were
> >> brought up once the network was restored.
> >>
> >> Then 30 minutes later I restarted one monitor to roll out a small
> >> config change (changing leveldb log path). Surprisingly that resulted
> >> in many OSDs (but seemingly fewer than before) being marked out again
> >> then quickly marked in again.
> >
> > Did the 'wrongly marked down' messages appear in ceph.log?
> >
> >> I only have the lowest level logs from this incident -- but I think
> >> it's easily reproducible.
> >
> > Logs with debug ms = 1 and debug mon = 20 would be best if someone is able
> > to reproduce this.
>
> I can reproduce using iptables to kill the network for 60s on one of
> our OSD hosts. Here are the logs with ms=1 mon=20:
> https://www.dropbox.com/s/vdzl005n2qiwlee/ceph.log.gz?dl=0
> https://www.dropbox.com/s/to26i8k11vp9t8k/ceph-mon.0.log.gz?dl=0
> https://www.dropbox.com/s/j5e3rujs7qjouzh/ceph-mon.2.log.gz?dl=0
>
> The badness happens after mon.2 is restarted:
>
> 2015-02-05 10:54:31.456887 mon.0 128.142.35.220:6789/0 602775 : [INF]
> osd.20 128.142.23.53:6850/57083 failed (3 reports from 3 peers after
> 41.616656 >= grace 38.742061)
> 2015-02-05 10:54:31.457036 mon.0 128.142.35.220:6789/0 602776 : [INF]
> osd.21 128.142.23.53:6870/50055 failed (5 reports from 4 peers after
> 39.614710 >= grace 39.553689)
> 2015-02-05 10:54:31.457092 mon.0 128.142.35.220:6789/0 602777 : [INF]
> osd.22 128.142.23.53:6831/45065 failed (5 reports from 4 peers after
> 45.615582 >= grace 42.927456)

Yep, it's a silly bug and I'm surprised we haven't noticed until now!
http://tracker.ceph.com/issues/10762
https://github.com/ceph/ceph/pull/3631

Thanks!
sage
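
For anyone checking their own ceph.log for the same pattern, here is a minimal Python sketch that pulls the OSD id, report count, peer count, elapsed time and grace value out of "failed" entries like the three quoted above. The regular expression is inferred only from those three lines (with the mail wrapping re-joined), not from the Ceph source, so other releases may format the message differently. The names FAILED_RE, OsdFailure and parse_failure are hypothetical helpers, not part of any Ceph tool.

```python
import re
from typing import NamedTuple, Optional

# Assumption: the message layout matches the three entries quoted in this
# thread, each re-joined onto a single line.
FAILED_RE = re.compile(
    r"(?P<osd>osd\.\d+) \S+ failed "
    r"\((?P<reports>\d+) reports from (?P<peers>\d+) peers "
    r"after (?P<elapsed>\d+\.\d+) >= grace (?P<grace>\d+\.\d+)\)"
)


class OsdFailure(NamedTuple):
    osd: str        # e.g. "osd.20"
    reports: int    # number of failure reports received
    peers: int      # number of distinct reporting peers
    elapsed: float  # seconds, as printed after "after"
    grace: float    # grace period in seconds, as printed


def parse_failure(line: str) -> Optional[OsdFailure]:
    """Return the parsed fields, or None if the line is not a 'failed' entry."""
    m = FAILED_RE.search(line)
    if m is None:
        return None
    return OsdFailure(
        osd=m["osd"],
        reports=int(m["reports"]),
        peers=int(m["peers"]),
        elapsed=float(m["elapsed"]),
        grace=float(m["grace"]),
    )


# The three entries from the message above, unwrapped.
SAMPLE_LINES = [
    "2015-02-05 10:54:31.456887 mon.0 128.142.35.220:6789/0 602775 : [INF] "
    "osd.20 128.142.23.53:6850/57083 failed (3 reports from 3 peers after "
    "41.616656 >= grace 38.742061)",
    "2015-02-05 10:54:31.457036 mon.0 128.142.35.220:6789/0 602776 : [INF] "
    "osd.21 128.142.23.53:6870/50055 failed (5 reports from 4 peers after "
    "39.614710 >= grace 39.553689)",
    "2015-02-05 10:54:31.457092 mon.0 128.142.35.220:6789/0 602777 : [INF] "
    "osd.22 128.142.23.53:6831/45065 failed (5 reports from 4 peers after "
    "45.615582 >= grace 42.927456)",
]

if __name__ == "__main__":
    for line in SAMPLE_LINES:
        failure = parse_failure(line)
        if failure is not None:
            # Show how far past the grace period each OSD was when marked failed.
            print(failure.osd, round(failure.elapsed - failure.grace, 6))
```

Feeding it a whole ceph.log line by line and grouping the results by timestamp should make a burst of failures right after a monitor restart easy to spot.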