Re: MDS crash when client goes to sleep

On Mon, Mar 24, 2014 at 6:26 PM, hjcho616 <hjcho616@xxxxxxxxx> wrote:
> I tried the patch twice.  The first time, it worked.  There was no issue.
> The client connected back to the MDS and was happily running.  All three MDS
> daemons were running OK.
>
> The second time, though, all three daemons were alive and health was reported
> OK.  However, the client does not connect to the MDS.  The MDS daemon gets the
> following messages over and over again.  192.168.1.30 is one of the OSDs.
> 2014-03-24 20:20:51.722367 7f400c735700  0 cephx: verify_reply couldn't
> decrypt with error: error decoding block for decryption
> 2014-03-24 20:20:51.722392 7f400c735700  0 -- 192.168.1.20:6803/21678 >>
> 192.168.1.30:6806/3796 pipe(0x2be3b80 sd=20 :56656 s=1 pgs=0 cs=0 l=1
> c=0x2bd6840).failed verifying authorize reply

This sounds different than the scenario you initially described, with
a client going to sleep. Exactly what are you doing?

>
> When I restart the MDS (not the OSDs) and run ceph health detail, I see an
> mds degraded message with a replay.  I restarted the OSDs again and it
> was OK.  Is there something I can do to prevent this?

That sounds normal -- the MDS has to replay its journal when it
restarts. It shouldn't take too long, but restarting OSDs definitely
won't help since the MDS is trying to read data off of them.
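
If you would rather watch the replay finish than restart anything, a rough
sketch like the one below could poll the MDS state. It assumes that
`ceph mds stat` prints each daemon's state as an `up:<state>` token (for
example `up:replay` or `up:active`); the function names, timeout, and
polling interval are made up for illustration and are not part of Ceph.

```python
import re
import subprocess
import time

# Assumption: states show up as "up:<state>" tokens in `ceph mds stat` output.
STATE_RE = re.compile(r"up:([a-z-]+)")


def mds_states(stat_output):
    """Return every MDS state name found in the status text, in order."""
    return STATE_RE.findall(stat_output)


def is_active(stat_output):
    """True once at least one MDS reports the 'active' state."""
    return "active" in mds_states(stat_output)


def wait_for_active(timeout=300, interval=5):
    """Poll `ceph mds stat` until an MDS is active or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["ceph", "mds", "stat"],
            capture_output=True,
            text=True,
            check=True,
        )
        if is_active(result.stdout):
            return True
        # Still replaying (or in a later recovery state); leave the OSDs alone.
        time.sleep(interval)
    return False
```

Calling `wait_for_active()` on a node with a working `ceph` CLI and keyring
returns True once an MDS is active, or False if the timeout runs out first.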
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



