Re: Upgraded Bobtail to Cuttlefish and unable to mount cephfs


 



Can you start up your mds with "debug mds = 20" and "debug ms = 20"?
The "failed to decode message" line is suspicious but there's not
enough context here for me to be sure, and my pattern-matching isn't
reminding me of any serious bugs.
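
A minimal sketch of where those settings would go, assuming they are set
in ceph.conf on the MDS hosts rather than injected at runtime. The [mds]
section applies to every MDS daemon, and the daemon needs a restart to
pick the values up. Level 20 is very verbose, so it is worth reverting
once the log has been captured:

```ini
[mds]
    ; verbose MDS subsystem logging
    debug mds = 20
    ; verbose messenger logging, which should give more context around
    ; the "failed to decode message" lines
    debug ms = 20
```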
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Thu, Aug 29, 2013 at 3:10 AM, Serge Slipchenko
<serge.slipchenko@xxxxxxxxxxxx> wrote:
> Hi,
>
> I upgraded Ceph from Bobtail to Cuttlefish and everything seemed fine.
> Then I started writing to cephfs, but at some point the writes stalled.
> Since then I have been unable to mount it, either with the kernel driver
> or with a custom utility.
>
> ceph -s shows that everything is good.
>
> health HEALTH_OK
> monmap e2: 2 mons at {m01=5.9.118.83:6789/0,m02=5.9.122.115:6789/0},
> election epoch 1320, quorum 0,1 m01,m02
> osdmap e3967: 16 osds: 16 up, 16 in
> pgmap v1315932: 256 pgs: 255 active+clean, 1 active+clean+scrubbing; 215
> GB data, 448 GB used, 38441 GB / 40971 GB avail; 37585KB/s rd, 1op/s
> mdsmap e774: 1/1/1 up {0=m02=up:active}, 1 up:standby
>
> But in the mds.a log I see the following messages:
>
> 2013-08-29 10:06:34.371166 7f49e68aa700  0 -- 5.9.122.115:6807/1077 >>
> 91.193.166.194:0/2272475298 pipe(0x8de3780 sd=74 :6807 s=0 pgs=0 cs=0
> l=0).accept peer addr is really 91.193.166.194:0/2272475298 (socket is
> 91.193.166.194:56649/0)
> 2013-08-29 10:07:38.454659 7f49e68aa700  0 -- 5.9.122.115:6807/1077 >>
> 91.193.166.194:0/2272475298 pipe(0x8de3780 sd=74 :6807 s=2 pgs=2 cs=1
> l=0).fault, server, going to standby
> 2013-08-29 10:23:06.898089 7f49e60a2700  0 -- 5.9.122.115:6807/1077 >>
> 91.193.166.194:0/3930317661 pipe(0x7442c000 sd=78 :6807 s=0 pgs=0 cs=0
> l=0).accept peer addr is really 91.193.166.194:0/3930317661 (socket is
> 91.193.166.194:56272/0)
> 2013-08-29 10:24:07.384136 7f49e60a2700  0 -- 5.9.122.115:6807/1077 >>
> 91.193.166.194:0/3930317661 pipe(0x7442c000 sd=78 :6807 s=2 pgs=2 cs=1
> l=0).fault, server, going to standby
> 2013-08-29 10:30:21.177807 7f49e5c9e700  0 -- 5.9.122.115:6807/1077 >>
> 91.193.166.194:0/1838286378 pipe(0x73bd8a00 sd=80 :6807 s=0 pgs=0 cs=0
> l=0).accept peer addr is really 91.193.166.194:0/1838286378 (socket is
> 91.193.166.194:59069/0)
> 2013-08-29 10:31:21.300004 7f49e5c9e700  0 -- 5.9.122.115:6807/1077 >>
> 91.193.166.194:0/1838286378 pipe(0x73bd8a00 sd=80 :6807 s=2 pgs=2 cs=1
> l=0).fault, server, going to standby
> 2013-08-29 11:17:17.331613 7f040de6b700  0 -- 5.9.122.115:6807/7622 >>
> 91.193.166.194:0/2689145238 pipe(0x13ea780 sd=34 :6807 s=2 pgs=2 cs=1
> l=0).fault with nothing to send, going to standby
> 2013-08-29 11:22:08.137711 7f0411897700  0 log [INF] : closing stale
> session client.76201 91.193.166.194:0/2689145238 after 304.270364
>
> And mds.b outputs a lot of:
>
> 2013-08-29 12:04:58.743938 7fa75604d700 -1 failed to decode message of
> type 23 v2: buffer::end_of_buffer
> 2013-08-29 12:04:58.743969 7fa75604d700  0 -- 5.9.122.115:6800/977 >>
> 144.76.13.103:0/925435369 pipe(0x524e780 sd=39 :6800 s=2 pgs=130763
> cs=12829 l=0).fault with nothing to send, going to standby
> 2013-08-29 12:04:58.744236 7fa755f4c700  0 -- 5.9.122.115:6800/977 >>
> 144.76.13.102:0/2955281877 pipe(0x524e500 sd=37 :6800 s=0 pgs=0 cs=0
> l=0).accept connect_seq 12834 vs existing 12833 state standby
> 2013-08-29 12:04:58.744607 7fa756754700  0 -- 5.9.122.115:6800/977 >>
> 144.76.13.105:0/347604456 pipe(0x52c5a00 sd=38 :6800 s=0 pgs=0 cs=0
> l=0).accept connect_seq 12538 vs existing 12537 state standby
> 2013-08-29 12:04:58.744627 7fa755f4c700 -1 failed to decode message of
> type 23 v2: buffer::end_of_buffer
> 2013-08-29 12:04:58.744671 7fa755f4c700  0 -- 5.9.122.115:6800/977 >>
> 144.76.13.102:0/2955281877 pipe(0x524e500 sd=37 :6800 s=2 pgs=292532
> cs=12835 l=0).fault with nothing to send, going to standby
> 2013-08-29 12:04:58.745006 7fa75614e700  0 -- 5.9.122.115:6800/977 >>
> 144.76.13.103:0/925435369 pipe(0x52c5780 sd=31 :6800 s=0 pgs=0 cs=0
> l=0).accept connect_seq 12830 vs existing 12829 state standby
> 2013-08-29 12:04:58.745102 7fa756754700 -1 failed to decode message of
> type 23 v2: buffer::end_of_buffer
> 2013-08-29 12:04:58.745146 7fa756754700  0 -- 5.9.122.115:6800/977 >>
> 144.76.13.105:0/347604456 pipe(0x52c5a00 sd=38 :6800 s=2 pgs=131368
> cs=12539 l=0).fault with nothing to send, going to standby
>
>
> --
> Kind regards, Serge Slipchenko
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



