Re: OSD issue: unable to obtain rotating service keys

Hello,

On Wed, 1 Jun 2016 20:21:29 -0500 Jeffrey McDonald wrote:

> Thanks Christian,
> I did google the error and I actually found this link. (Of course, I
> wouldn't want to waste others' time either.) It appears to me to be a
> different issue from what I see, because the OSDs actually fail to start.
> Anyway, after a few minutes I restarted the OSDs and they started
> normally.

It might well be the same issue.
As in, did you monitor the MONs at that time?
If they were too busy to respond to the OSDs, this could explain what you
were seeing.
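
To line the OSD timeouts up against MON load, it helps to pull the
timestamps out of the OSD log first. A minimal sketch, assuming each log
entry is on a single line in the layout quoted further down in this thread;
the function name and the sample lines are illustrative only, not part of
any Ceph tooling:

```python
import re
from datetime import datetime

# Matches the monclient timeout entry as it appears in the OSD log quoted
# in this thread: timestamp, thread id, log level, then the message.
TIMEOUT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \S+\s+-?\d+ monclient: "
    r"wait_auth_rotating timed out after (\d+)"
)


def rotating_key_timeouts(lines):
    """Return (timestamp, timeout_seconds) for each wait_auth_rotating timeout."""
    hits = []
    for line in lines:
        m = TIMEOUT_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            hits.append((ts, int(m.group(2))))
    return hits


# Sample entries copied from the report below, rejoined onto single lines.
sample = [
    "2016-06-01 12:17:49.219512 7f64a70ea8c0  0 monclient: wait_auth_rotating timed out after 30",
    "2016-06-01 12:17:49.219605 7f64a70ea8c0 -1 osd.177 282877 unable to obtain rotating service keys; retrying",
    "2016-06-01 12:18:19.219740 7f64a70ea8c0  0 monclient: wait_auth_rotating timed out after 30",
]

hits = rotating_key_timeouts(sample)
for ts, secs in hits:
    print(ts.isoformat(), secs)
```

The resulting timestamps can then be compared with whatever load or
election activity the MON logs show for the same window.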

Christian
> I have nine similar OSD nodes, and the updates on the others did not
> trigger this issue.
> I'll update the list if there are any additional issues related to this.
> Best Regards,
> Jeff
> 
> On Wed, Jun 1, 2016 at 7:15 PM, Christian Balzer <chibi@xxxxxxx> wrote:
> 
> >
> > Hello,
> >
> > On Wed, 1 Jun 2016 12:31:41 -0500 Jeffrey McDonald wrote:
> >
> > > Hi,
> > >
> > > I just performed a minor Ceph upgrade on my Ubuntu 14.04 cluster, from
> > > Ceph version 0.94.6-1trusty to 0.94.7-1trusty. Upon restarting
> > > the OSDs, I receive the error message:
> > >
> > Unfortunately (despite what common sense would suggest) there are no
> > minor upgrades in Ceph.
> > As in, a lot of the point releases in stable, LTS Ceph versions have
> > caused more havoc than major version upgrades, often due to backported
> > features. The most notable recent example is the 0.94.6 release, which
> > had massive data loss potential with cache tiers.
> >
> > What you're seeing looks a lot like what was discussed and solved here.
> > It's always a good idea to google the error messages you see:
> >
> > http://www.spinics.net/lists/ceph-devel/msg30450.html
> >
> > Regards,
> >
> > Christian
> >
> > > 2016-06-01 12:17:49.219512 7f64a70ea8c0  0 monclient:
> > > wait_auth_rotating timed out after 30
> > > 2016-06-01 12:17:49.219605 7f64a70ea8c0 -1 osd.177 282877 unable to
> > > obtain rotating service keys; retrying
> > > 2016-06-01 12:18:19.219740 7f64a70ea8c0  0 monclient:
> > > wait_auth_rotating timed out after 30
> > > 2016-06-01 12:18:19.219782 7f64a70ea8c0 -1 osd.177 282877 unable to
> > > obtain rotating service keys; retrying
> > > 2016-06-01 12:18:49.219869 7f64a70ea8c0  0 monclient:
> > > wait_auth_rotating timed out after 30
> > > 2016-06-01 12:18:49.219908 7f64a70ea8c0 -1 osd.177 282877 unable to
> > > obtain rotating service keys; retrying
> > >
> > > Eventually the OSD fails to start. Oddly, this affects only some
> > > of the OSDs on this host.
> > > On the mon host, there is a warning:
> > >
> > > 2016-06-01 11:22:52.171152 osd.177 10.31.0.71:6842/10245 433 :
> > > cluster [WRN] failed to encode map e282667 with expected crc
> > > 2016-06-01 11:22:55.399487 osd.177 10.31.0.71:6842/10245 434 :
> > > cluster [WRN] failed to encode map e282668 with expected crc
> > > 2016-06-01 11:22:55.572238 osd.177 10.31.0.71:6842/10245 435 :
> > > cluster [WRN] failed to encode map e282668 with expected crc
> > > 2016-06-01 11:22:59.571924 osd.177 10.31.0.71:6842/10245 436 :
> > > cluster [WRN] failed to encode map e282669 with expected crc
> > > 2016-06-01 11:22:59.665925 osd.177 10.31.0.71:6842/10245 437 :
> > > cluster [WRN] failed to encode map e282669 with expected crc
> > > 2016-06-01 11:22:59.671475 osd.177 10.31.0.71:6842/10245 438 :
> > > cluster [WRN] failed to encode map e282669 with expected crc
> > > 2016-06-01 11:22:59.780646 osd.177 10.31.0.71:6842/10245 439 :
> > > cluster [WRN] failed to encode map e282669 with expected crc
> > > 2016-06-01 11:23:00.778250 osd.177 10.31.0.71:6842/10245 440 :
> > > cluster [WRN] failed to encode map e282670 with expected crc
> > > 2016-06-01 11:23:02.075996 osd.177 10.31.0.71:6842/10245 441 :
> > > cluster [WRN] failed to encode map e282671 with expected crc
> > > 2016-06-01 11:23:03.110117 osd.177 10.31.0.71:6842/10245 442 :
> > > cluster [WRN] failed to encode map e282672 with expected crc
> > > 2016-06-01 11:23:03.255597 osd.177 10.31.0.71:6842/10245 443 :
> > > cluster [WRN] failed to encode map e282672 with expected crc
> > > 2016-06-01 11:23:03.258223 osd.177 10.31.0.71:6842/10245 444 :
> > > cluster [WRN] failed to encode map e282672 with expected crc
> > > 2016-06-01 11:23:05.287753 osd.177 10.31.0.71:6842/10245 445 :
> > > cluster [WRN] failed to encode map e282673 with expected crc
> > >
> > >
> > > How do I clear these up after the upgrade? All of the filesystems
> > > on the OSDs are mounted and the keyrings are there.
> > >
> > > Thanks,
> > > Jeff


-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


