On Mon, Jul 4, 2016 at 12:38 PM, Dzianis Kahanovich <mahatma@xxxxxxx> wrote:
> Gregory Farnum wrote:
>> On Thu, Jun 30, 2016 at 1:03 PM, Dzianis Kahanovich <mahatma@xxxxxxx> wrote:
>>> Upgraded infernalis->jewel (git, Gentoo). The upgrade was done as a single
>>> global stop/restart of everything.
>>>
>>> Infernalis: e5165: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1 up:standby
>>>
>>> Now, after the post-upgrade start and the next mon restart, the active monitor
>>> crashes with "assert(info.state == MDSMap::STATE_STANDBY)" (even without a
>>> running mds). Fixed:
>>>
>>> --- a/src/mon/MDSMonitor.cc 2016-06-27 21:26:26.000000000 +0300
>>> +++ b/src/mon/MDSMonitor.cc 2016-06-28 10:44:32.000000000 +0300
>>> @@ -2793,7 +2793,11 @@ bool MDSMonitor::maybe_promote_standby(s
>>>      for (const auto &j : pending_fsmap.standby_daemons) {
>>>        const auto &gid = j.first;
>>>        const auto &info = j.second;
>>> -      assert(info.state == MDSMap::STATE_STANDBY);
>>> +//      assert(info.state == MDSMap::STATE_STANDBY);
>>> +      if (info.state != MDSMap::STATE_STANDBY) {
>>> +        dout(0) << "gid " << gid << " ex-assert(info.state == MDSMap::STATE_STANDBY) " << do_propose << dendl;
>>> +        return do_propose;
>>> +      }
>>>
>>>        if (!info.standby_replay) {
>>>          continue;
>>>
>>>
>>> Now: e5442: 1/1/1 up {0=a=up:active}, 1 up:standby
>>> - but there are really 3 mds (active, replay, standby).
>>>
>>> # ceph mds dump
>>> dumped fsmap epoch 5442
>>> fs_name cephfs
>>> epoch 5441
>>> flags 0
>>> created 2016-04-10 23:44:38.858769
>>> modified 2016-06-27 23:08:26.211880
>>> tableserver 0
>>> root 0
>>> session_timeout 60
>>> session_autoclose 300
>>> max_file_size 1099511627776
>>> last_failure 5239
>>> last_failure_osd_epoch 18473
>>> compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table}
>>> max_mds 1
>>> in 0
>>> up {0=3104110}
>>> failed
>>> damaged
>>> stopped
>>> data_pools 5
>>> metadata_pool 6
>>> inline_data disabled
>>> 3104110: 10.227.227.103:6800/14627 'a' mds.0.5436 up:active seq 30
>>> 3084126: 10.227.227.104:6800/24069 'c' mds.0.0 up:standby-replay seq 1
>>>
>>>
>>> If standby-replay is false, all is OK: 1/1/1 up {0=a=up:active}, 2 up:standby
>>>
>>> How can I fix this 3-mds behaviour?
>>
>> Ah, you hit a known bug with that assert. I thought the fix was
>> already in the latest point release; are you behind?
>> -Greg
>>
>
> Checked the logs - observed in version 10.2.2-45-g9aafefe
> (9aafefeab6b0f01d7467f70cb2f1b16ae88340e8) - the latest git jewel branch as of 27.06.
> Where is the fix?

Ah, I see another report of this as well. Created a ticket:
http://tracker.ceph.com/issues/16592.
-Greg

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com