Re: mds standby + standby-replay upgrade

On Mon, Jul 4, 2016 at 12:38 PM, Dzianis Kahanovich <mahatma@xxxxxxx> wrote:
> Gregory Farnum wrote:
>> On Thu, Jun 30, 2016 at 1:03 PM, Dzianis Kahanovich <mahatma@xxxxxxx> wrote:
>>> Upgraded infernalis->jewel (git, Gentoo). The upgrade was done by
>>> stopping and restarting everything in one shot.
>>>
>>> Infernalis: e5165: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1 up:standby
>>>
>>> Now, after starting the upgraded cluster and on the next mon restart, the
>>> active monitor crashes with "assert(info.state == MDSMap::STATE_STANDBY)"
>>> (even with no mds running). Worked around with:
>>>
>>> --- a/src/mon/MDSMonitor.cc     2016-06-27 21:26:26.000000000 +0300
>>> +++ b/src/mon/MDSMonitor.cc     2016-06-28 10:44:32.000000000 +0300
>>> @@ -2793,7 +2793,11 @@ bool MDSMonitor::maybe_promote_standby(s
>>>      for (const auto &j : pending_fsmap.standby_daemons) {
>>>        const auto &gid = j.first;
>>>        const auto &info = j.second;
>>> -      assert(info.state == MDSMap::STATE_STANDBY);
>>> +//      assert(info.state == MDSMap::STATE_STANDBY);
>>> +      if (info.state != MDSMap::STATE_STANDBY) {
>>> +        dout(0) << "gid " << gid << " ex-assert(info.state == MDSMap::STATE_STANDBY) " << do_propose << dendl;
>>> +        return do_propose;
>>> +      }
>>>
>>>        if (!info.standby_replay) {
>>>          continue;
>>>
>>>
>>> Now: e5442: 1/1/1 up {0=a=up:active}, 1 up:standby
>>> - but there are actually 3 mds daemons (active, standby-replay, standby).
>>>
>>> # ceph mds dump
>>> dumped fsmap epoch 5442
>>> fs_name cephfs
>>> epoch   5441
>>> flags   0
>>> created 2016-04-10 23:44:38.858769
>>> modified        2016-06-27 23:08:26.211880
>>> tableserver     0
>>> root    0
>>> session_timeout 60
>>> session_autoclose       300
>>> max_file_size   1099511627776
>>> last_failure    5239
>>> last_failure_osd_epoch  18473
>>> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>>> ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses
>>> versioned encoding,6=dirfrag is stored in omap,8=no anchor table}
>>> max_mds 1
>>> in      0
>>> up      {0=3104110}
>>> failed
>>> damaged
>>> stopped
>>> data_pools      5
>>> metadata_pool   6
>>> inline_data     disabled
>>> 3104110:        10.227.227.103:6800/14627 'a' mds.0.5436 up:active seq 30
>>> 3084126:        10.227.227.104:6800/24069 'c' mds.0.0 up:standby-replay seq 1
>>>
>>>
>>> With standby-replay set to false, everything is OK: 1/1/1 up {0=a=up:active}, 2 up:standby
>>>
>>> How can I fix this behaviour with 3 mds?
>>
>> Ah, you hit a known bug with that assert. I thought the fix was
>> already in the latest point release; are you behind?
>> -Greg
>>
>
> Checked the logs: observed in version 10.2.2-45-g9aafefe
> (9aafefeab6b0f01d7467f70cb2f1b16ae88340e8), the latest jewel git branch as of 27.06.
> Which release contains the fix?

Ah, I see another report of this as well. Created a ticket:
http://tracker.ceph.com/issues/16592.
-Greg
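
The workaround quoted above replaces a hard assert with a logged early return.
As a minimal standalone sketch of that defensive pattern (this is not the Ceph
code, and not the fix tracked in the ticket), the loop below skips an entry
found in an unexpected state instead of aborting. State, Info and
count_promotable are hypothetical names invented for illustration. Unlike the
quoted patch, which returns from the function, this sketch uses continue so
that the remaining daemons are still examined.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <map>

// Hypothetical stand-ins for the MDS state and per-daemon info;
// these are NOT the real Ceph types.
enum class State { Standby, StandbyReplay, Active };

struct Info {
  State state;
  bool standby_replay;  // daemon is configured to follow an active rank
};

// Counts the standby daemons that are candidates for standby-replay.
// An entry in an unexpected state is logged and skipped, not asserted on.
int count_promotable(const std::map<std::uint64_t, Info>& standby_daemons) {
  int promotable = 0;
  for (const auto& j : standby_daemons) {
    const auto& gid = j.first;
    const auto& info = j.second;
    if (info.state != State::Standby) {
      // Where the original code asserted, report the stale entry and move on.
      std::cerr << "gid " << gid << " is not in standby state; skipping\n";
      continue;
    }
    if (!info.standby_replay) {
      continue;
    }
    ++promotable;
  }
  return promotable;
}
```

Whether skipping or returning early is the safer choice depends on how the
caller treats a partially processed map; the sketch only illustrates the
difference between the two.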