On Thu, Jun 30, 2016 at 1:03 PM, Dzianis Kahanovich <mahatma@xxxxxxx> wrote:
> Upgraded infernalis->jewel (git, Gentoo). Upgrade passed over global
> stop/restart everything oneshot.
>
> Infernalis: e5165: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1 up:standby
>
> Now after upgrade start and next mon restart, active monitor falls with
> "assert(info.state == MDSMap::STATE_STANDBY)" (even without running mds) . Fixed:
>
> --- a/src/mon/MDSMonitor.cc 2016-06-27 21:26:26.000000000 +0300
> +++ b/src/mon/MDSMonitor.cc 2016-06-28 10:44:32.000000000 +0300
> @@ -2793,7 +2793,11 @@ bool MDSMonitor::maybe_promote_standby(s
>      for (const auto &j : pending_fsmap.standby_daemons) {
>        const auto &gid = j.first;
>        const auto &info = j.second;
> -      assert(info.state == MDSMap::STATE_STANDBY);
> +//      assert(info.state == MDSMap::STATE_STANDBY);
> +      if (info.state != MDSMap::STATE_STANDBY) {
> +        dout(0) << "gid " << gid << " ex-assert(info.state == MDSMap::STATE_STANDBY) " << do_propose << dendl;
> +        return do_propose;
> +      }
>
>        if (!info.standby_replay) {
>          continue;
>
>
> Now: e5442: 1/1/1 up {0=a=up:active}, 1 up:standby
> - but really there are 3 mds (active, replay, standby).
>
> # ceph mds dump
> dumped fsmap epoch 5442
> fs_name cephfs
> epoch 5441
> flags 0
> created 2016-04-10 23:44:38.858769
> modified 2016-06-27 23:08:26.211880
> tableserver 0
> root 0
> session_timeout 60
> session_autoclose 300
> max_file_size 1099511627776
> last_failure 5239
> last_failure_osd_epoch 18473
> compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable
> ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses
> versioned encoding,6=dirfrag is stored in omap,8=no anchor table}
> max_mds 1
> in 0
> up {0=3104110}
> failed
> damaged
> stopped
> data_pools 5
> metadata_pool 6
> inline_data disabled
> 3104110: 10.227.227.103:6800/14627 'a' mds.0.5436 up:active seq 30
> 3084126: 10.227.227.104:6800/24069 'c' mds.0.0 up:standby-replay seq 1
>
>
> If standby-replay false - all OK: 1/1/1 up {0=a=up:active}, 2 up:standby
>
> How to fix this 3-mds behaviour?

Ah, you hit a known bug with that assert. I thought the fix was
already in the latest point release; are you behind?
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com