Hi Dan,

We ended up upgrading all mons+mgrs to 14.2.9 and the message stopped and the PG stats reappeared, as expected. Marcello started the OSD restarts this morning.

I think it would have been much less stressful to get the cluster onto 12.2.13 before the nautilus upgrade, and much easier to get help should anything go wrong. Something to bear in mind for the future.

Cheers,
Tom

> -----Original Message-----
> From: Dan van der Ster <dan@xxxxxxxxxxxxxx>
> Sent: 18 May 2020 10:18
> To: Byrne, Thomas (STFC,RAL,SC) <tom.byrne@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxx; Armand Pilon, Marcello (STFC,RAL,SC) <marcello.armand-pilon@xxxxxxxxxx>
> Subject: Re: Luminous to Nautilus mon upgrade oddity - failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer
>
> Hi Tom,
>
> Did you get past this? It looks like the mon is confused about how to decode because of your non-standard release.
> (So I imagine that running all 14.2.9 mons would get past it, but if you're being cautious this should be reproducible on your test cluster.)
>
> -- Dan
>
> On Wed, May 13, 2020 at 12:07 PM Thomas Byrne - UKRI STFC <tom.byrne@xxxxxxxxxx> wrote:
> >
> > Hi all,
> >
> > We're upgrading a cluster from luminous to nautilus. The monitors and managers are running a non-release version of luminous (12.2.12-642-g5ff3e8e) and we're upgrading them to 14.2.9.
> >
> > We've upgraded one monitor and it's happily in quorum as a peon. However, when a ceph status hits the nautilus mon it apparently has trouble talking to the manager, and it returns a status output with no pg stats and garbage usage numbers. From the mon log:
> >
> > 2020-05-13 10:41:43.121 7fa1e6fdf700 0 mon.ceph-mon5@4(peon) e25 handle_command mon_command({"prefix": "status"} v 0) v1
> > 2020-05-13 10:41:43.121 7fa1e6fdf700 0 log_channel(audit) log [DBG] : from='client.? v1:130.246.x.x:0/3261311028' entity='client.admin' cmd=[{"prefix": "status"}]: dispatch
> > 2020-05-13 10:41:43.443 7fa1e6fdf700 -1 mon.ceph-mon5@4(peon).mgrstat failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer
> > 2020-05-13 10:41:44.397 7fa1e6fdf700 1 mon.ceph-mon5@4(peon) e25 dropping unexpected mon_health( e 0 r 0 ) v1
> >
> > Is this expected for a luminous to nautilus upgrade or could this be due to the odd luminous version we are running, or something else entirely?
> >
> > Cheers,
> > Tom
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
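
For anyone following the same upgrade path: the decode error in this thread only appeared while the mons and mgrs were on mixed versions, so it can help to see at a glance which daemon types are still split. Below is a rough Python sketch, not something taken from the thread. It assumes you feed it the JSON printed by `ceph versions` and that the version keys look like "ceph version <version> (<hash>) <codename> (stable)". The function name `mixed_version_daemons`, the daemon counts, and the "hash-elided" placeholders in the sample are all made up for illustration.

```python
import json


def mixed_version_daemons(versions_json: str) -> dict:
    """Return {daemon_type: {version: count}} for daemon types running more than one version.

    Assumes input shaped like the JSON from `ceph versions`:
    {"mon": {"ceph version X (...) ...": N, ...}, "mgr": {...}, ..., "overall": {...}}
    """
    report = json.loads(versions_json)
    mixed = {}
    for daemon_type, by_version in report.items():
        if daemon_type == "overall":
            # "overall" aggregates every daemon type, so it is skipped here.
            continue
        counts = {}
        for full_name, count in by_version.items():
            # Assumed key format: "ceph version <version> (<hash>) <codename> (stable)".
            parts = full_name.split()
            version = parts[2] if len(parts) > 2 else full_name
            counts[version] = counts.get(version, 0) + count
        if len(counts) > 1:
            mixed[daemon_type] = counts
    return mixed


# Illustrative sample only: the counts and the "hash-elided" placeholders are invented.
sample = json.dumps({
    "mon": {
        "ceph version 12.2.12-642-g5ff3e8e (hash-elided) luminous (stable)": 4,
        "ceph version 14.2.9 (hash-elided) nautilus (stable)": 1,
    },
    "mgr": {
        "ceph version 12.2.12-642-g5ff3e8e (hash-elided) luminous (stable)": 3,
    },
    "overall": {
        "ceph version 12.2.12-642-g5ff3e8e (hash-elided) luminous (stable)": 7,
        "ceph version 14.2.9 (hash-elided) nautilus (stable)": 1,
    },
})

print(mixed_version_daemons(sample))
```

On a real cluster you would pass in the output of `ceph versions` instead of the sample. An empty result means each daemon type is on a single version, though that by itself says nothing about whether the mgrstat decode problem is resolved.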