Re: mds standby + standby-reply upgrade

Gregory Farnum wrote:
> On Thu, Jun 30, 2016 at 1:03 PM, Dzianis Kahanovich <mahatma@xxxxxxx> wrote:
>> Upgraded infernalis -> jewel (git, Gentoo). The upgrade was done by stopping
>> and restarting everything in one shot.
>>
>> Infernalis: e5165: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1 up:standby
>>
>> Now, after the first start following the upgrade and the next mon restart, the
>> active monitor crashes with "assert(info.state == MDSMap::STATE_STANDBY)"
>> (even with no mds running). Worked around with:
>>
>> --- a/src/mon/MDSMonitor.cc     2016-06-27 21:26:26.000000000 +0300
>> +++ b/src/mon/MDSMonitor.cc     2016-06-28 10:44:32.000000000 +0300
>> @@ -2793,7 +2793,11 @@ bool MDSMonitor::maybe_promote_standby(s
>>      for (const auto &j : pending_fsmap.standby_daemons) {
>>        const auto &gid = j.first;
>>        const auto &info = j.second;
>> -      assert(info.state == MDSMap::STATE_STANDBY);
>> +//      assert(info.state == MDSMap::STATE_STANDBY);
>> +      if (info.state != MDSMap::STATE_STANDBY) {
>> +        dout(0) << "gid " << gid << " ex-assert(info.state == MDSMap::STATE_STANDBY) " << do_propose << dendl;
>> +        return do_propose;
>> +      }
>>
>>        if (!info.standby_replay) {
>>          continue;
>>
>>
>> Now: e5442: 1/1/1 up {0=a=up:active}, 1 up:standby
>> - but there are actually 3 mds (active, standby-replay, standby).
>>
>> # ceph mds dump
>> dumped fsmap epoch 5442
>> fs_name cephfs
>> epoch   5441
>> flags   0
>> created 2016-04-10 23:44:38.858769
>> modified        2016-06-27 23:08:26.211880
>> tableserver     0
>> root    0
>> session_timeout 60
>> session_autoclose       300
>> max_file_size   1099511627776
>> last_failure    5239
>> last_failure_osd_epoch  18473
>> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table}
>> max_mds 1
>> in      0
>> up      {0=3104110}
>> failed
>> damaged
>> stopped
>> data_pools      5
>> metadata_pool   6
>> inline_data     disabled
>> 3104110:        10.227.227.103:6800/14627 'a' mds.0.5436 up:active seq 30
>> 3084126:        10.227.227.104:6800/24069 'c' mds.0.0 up:standby-replay seq 1
>>
>>
>> With standby-replay set to false everything is OK: 1/1/1 up {0=a=up:active}, 2 up:standby
>>
>> How can I fix this behaviour with 3 mds?
> 
> Ah, you hit a known bug with that assert. I thought the fix was
> already in the latest point release; are you behind?
> -Greg
> 

Checked the logs: this was observed in version 10.2.2-45-g9aafefe
(9aafefeab6b0f01d7467f70cb2f1b16ae88340e8), the latest jewel git branch as of 27.06.
Which release contains the fix?
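
For illustration, the guard in the quoted patch can be sketched standalone
as below. This is only a rough model under stated assumptions: State, Info,
all_standby and example are hypothetical stand-ins, not the real MDSMonitor
or MDSMap definitions, and the second gid in example() is made up. It shows
the idea of logging and returning early instead of asserting when an entry
in the standby list is not in the standby state.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <map>

// Hypothetical stand-ins for the monitor's types (NOT the real Ceph classes).
enum class State { Standby, StandbyReplay, Active };

struct Info {
  State state;
  bool standby_replay;
};

// Walks the standby list. Instead of asserting that every entry is in the
// standby state, it logs the offending gid and returns false early, which
// mirrors the "return do_propose" workaround in the quoted patch.
bool all_standby(const std::map<std::uint64_t, Info>& standby_daemons) {
  for (const auto& j : standby_daemons) {
    const auto& gid = j.first;
    const auto& info = j.second;
    if (info.state != State::Standby) {
      std::cerr << "gid " << gid << " is not in standby state\n";
      return false;
    }
  }
  return true;
}

// Small usage example: gid 3084126 is the standby-replay daemon from the
// dump above; gid 1 is a made-up plain standby.
void example() {
  std::map<std::uint64_t, Info> daemons;
  daemons.insert({1, Info{State::Standby, false}});
  assert(all_standby(daemons));

  daemons.insert({3084126, Info{State::StandbyReplay, true}});
  assert(!all_standby(daemons));
}
```

Returning early keeps the monitor running but skips the promotion logic for
that pass, so it is a workaround rather than a fix for the stale state.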

-- 
WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.by/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com