Re: Standby-replay mds: 10.2.2

On Mon, Nov 14, 2016 at 12:46 AM, Goncalo Borges
<goncalo.borges@xxxxxxxxxxxxx> wrote:
> Hi Greg, John, Zheng, CephFSers
>
> Maybe a simple question but I think it is better to ask first than to
> complain after.
>
> We are currently undergoing an infrastructure migration. One of the first
> machines to go through this migration process is our standby-replay mds. We
> are running 10.2.2. My plan is to:

Is the 10.2.2 here a typo?  What's the current version that you're
upgrading to 10.2.2 from?

> - Shut down the standby-replay mds
> - Reinstall it with 10.2.2 on a different host, reusing the same IP, keys and
> configurations.

Any particular reason for keeping the same IP?  In general you don't
need to worry about that at all: I'd usually just delete the old MDS
entirely and create a new one, only keeping the ceph.conf section that
configures your standby replay options.

> - Start the mds service
>
> I wasn't thinking this was problematic until I read:
> http://tracker.ceph.com/issues/17466
>
> The issue mentioned above started when the site admin added a new mds.
> He also did an (unintended) upgrade of the mds(es) from 10.2.1 to 10.2.3 but
> I am not sure if this is the reason for the problem. His mons started to fail
> because they got an invalid fscid, and the reason is some incoherent
> ordering of rank and fscid between the constructor and the struct.

The actual issue (we think) was that the message decode was getting a
junk value for fscid when the beacon was sent by an older MDS, due to a
missing default initialisation, and that the MDSMonitor was then
failing to validate that value.
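
For illustration only, here is a minimal C++ sketch of that class of bug
and the two defences against it. This is not the actual Ceph beacon or
MDSMonitor code: the Beacon struct, the one-byte version prefix, the
FSCID_NONE sentinel and all helper names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <set>
#include <vector>

// Hypothetical names throughout; this is NOT the real Ceph code.
constexpr int32_t FSCID_NONE = -1;  // sentinel: "no filesystem specified"

struct Beacon {
  int32_t standby_for_rank = -1;
  // Defence 1: the newer field has a default initialiser, so it holds a
  // known value when an older sender's message does not carry it.
  // Without "= FSCID_NONE" a default-initialised Beacon would leave this
  // member indeterminate, i.e. junk.
  int32_t standby_for_fscid = FSCID_NONE;
};

// Wire format for this sketch:
//   [version: 1 byte][rank: 4 bytes][fscid: 4 bytes, version >= 2 only]
std::vector<uint8_t> encode(const Beacon& b, uint8_t version) {
  assert(version >= 1);
  std::vector<uint8_t> out{version};
  auto put = [&out](int32_t v) {
    uint8_t raw[sizeof v];
    std::memcpy(raw, &v, sizeof v);
    out.insert(out.end(), raw, raw + sizeof v);
  };
  put(b.standby_for_rank);
  if (version >= 2) put(b.standby_for_fscid);  // older senders omit this
  return out;
}

// Returns false if the buffer is too short for the version it claims.
bool decode(const std::vector<uint8_t>& in, Beacon& out) {
  if (in.empty()) return false;
  const uint8_t version = in[0];
  const std::size_t need = 1 + sizeof(int32_t) * (version >= 2 ? 2 : 1);
  if (in.size() < need) return false;

  Beacon b;  // every field starts at its default
  std::memcpy(&b.standby_for_rank, in.data() + 1, sizeof(int32_t));
  if (version >= 2) {
    std::memcpy(&b.standby_for_fscid, in.data() + 1 + sizeof(int32_t),
                sizeof(int32_t));
  }
  out = b;
  return true;
}

// Defence 2: the receiver checks the value before using it, accepting
// only the sentinel or an id it actually knows about.
bool fscid_is_known(const Beacon& b, const std::set<int32_t>& known) {
  return b.standby_for_fscid == FSCID_NONE ||
         known.count(b.standby_for_fscid) > 0;
}
```

In this sketch, decoding a version 1 buffer leaves standby_for_fscid at
FSCID_NONE, and fscid_is_known() rejects any id the receiver has never
heard of. Either check on its own would stop a junk value from being
acted on.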

This code path was only hit in cases where standby_for_rank was set,
so for that particular symptom you should be okay if you just don't
set standby_for_rank at all (if you have one MDS, your standby replay
daemon will always pick up that rank).
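
In ceph.conf terms, that would look roughly like the fragment below. The
section name mds.b is only a placeholder for your standby daemon's id,
and I'm assuming the Jewel-era option names here, so check them against
the docs for your release.

```ini
[mds.b]
    ; follow the active MDS's journal as a hot standby
    mds standby replay = true
    ; "mds standby for rank" is deliberately left unset
```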

John

> I just want to be sure that I won't hit a similar issue:
> - In what exact circumstances is this problem triggered?
> - Is it triggered when you add a brand new standby-replay mds (new IP, new
> key)? I am hoping that in my case, I shouldn't be affected.
>
> TIA
> Goncalo
>
>
>
>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>