Fwd: mount: 10.0.6.10:/: can't read superblock

Trying to remove one of the folders made mds.a and mds.b stop, so something is wrong with my MDS.
ceph -s gives:

2012-06-06 06:19:19.899973 pg v1220573: 1152 pgs: 1152 active+clean; 191 GB data, 393 GB used, 973 GB / 1379 GB avail 
2012-06-06 06:19:19.905097 mds e78: 1/1/1 up {0=c=up:active} 
2012-06-06 06:19:19.905200 osd e1114: 8 osds: 8 up, 8 in 
2012-06-06 06:19:19.905400 log 2012-06-06 05:51:31.499366 osd.3 10.0.6.11:6804/2933 804 : [INF] 0.c scrub ok 
2012-06-06 06:19:19.905598 mon e1: 3 mons at {a=10.0.6.10:6789/0,b=10.0.6.11:6789/0,c=10.0.6.12:6789/0} 
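
The MDS state can be read from the mds line of this output. Below is a minimal sketch, assuming the line format shown in this thread (for example "mds e78: 1/1/1 up {0=c=up:active}"); other Ceph versions may print it differently. The function name mds_rank_states is only an illustrative helper, not part of any Ceph tool.

```python
import re

# Matches rank entries such as "0=c=up:active" inside the braces of the mds line.
# The pattern is an assumption based on the output pasted in this thread.
RANK_RE = re.compile(r"(\d+)=([^=,{}\s]+)=up:([\w-]+)")


def mds_rank_states(status_text):
    """Return {rank: (daemon_name, state)} from the 'mds e<N>:' line of `ceph -s`."""
    for line in status_text.splitlines():
        if re.search(r"\bmds e\d+:", line):
            return {int(rank): (name, state)
                    for rank, name, state in RANK_RE.findall(line)}
    return {}


sample = "2012-06-06 06:19:19.905097 mds e78: 1/1/1 up {0=c=up:active}"
print(mds_rank_states(sample))  # {0: ('c', 'active')}
```

A rank that stays in a state such as replay instead of reaching active would show up here as ('c', 'replay').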

I checked the log files on ceph1 and ceph2, where I have my mon. 

mds.a ------------------- 
cessful recovery! 
-2> 2012-06-06 05:38:35.956195 7f2d5ea08700 1 mds.0.12 active_start 
-1> 2012-06-06 05:38:35.967760 7f2d5ea08700 1 mds.0.12 cluster recovered. 
0> 2012-06-06 05:38:37.200297 7f2d5ea08700 -1 mds/AnchorServer.cc: In function 'virtual void AnchorServer::handle_query(MMDSTableRequest*)' thread 7f2d5ea08700 time 2012-06-06 05:38:37.198981 
mds/AnchorServer.cc: 249: FAILED assert(anchor_map.count(curino) == 1) 

ceph version 0.46 (commit:cb7f1c9c7520848b0899b26440ac34a8acea58d1) 
1: (AnchorServer::handle_query(MMDSTableRequest*)+0x175) [0x6bdc95] 
2: (MDS::handle_deferrable_message(Message*)+0xd84) [0x4b0474] 
3: (MDS::_dispatch(Message*)+0xaf8) [0x4c50b8] 
4: (MDS::ms_dispatch(Message*)+0x1fb) [0x4c628b] 
5: (SimpleMessenger::dispatch_entry()+0x979) [0x7acb49] 
6: (SimpleMessenger::DispatchThread::entry()+0xd) [0x7336ed] 
7: (()+0x68ca) [0x7f2d6346e8ca] 
8: (clone()+0x6d) [0x7f2d61cf692d] 
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

--- end dump of recent events --- 
2012-06-06 05:38:37.203277 7f2d5ea08700 -1 *** Caught signal (Aborted) ** 
in thread 7f2d5ea08700 

ceph version 0.46 (commit:cb7f1c9c7520848b0899b26440ac34a8acea58d1) 
1: /usr/bin/ceph-mds() [0x814279] 
2: (()+0xeff0) [0x7f2d63476ff0] 
3: (gsignal()+0x35) [0x7f2d61c591b5] 
4: (abort()+0x180) [0x7f2d61c5bfc0] 
5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f2d624eddc5] 
6: (()+0xcb166) [0x7f2d624ec166] 
7: (()+0xcb193) [0x7f2d624ec193] 
8: (()+0xcb28e) [0x7f2d624ec28e] 
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) [0x74f9b0] 
10: (AnchorServer::handle_query(MMDSTableRequest*)+0x175) [0x6bdc95] 
11: (MDS::handle_deferrable_message(Message*)+0xd84) [0x4b0474] 
12: (MDS::_dispatch(Message*)+0xaf8) [0x4c50b8] 
13: (MDS::ms_dispatch(Message*)+0x1fb) [0x4c628b] 
14: (SimpleMessenger::dispatch_entry()+0x979) [0x7acb49] 
15: (SimpleMessenger::DispatchThread::entry()+0xd) [0x7336ed] 
16: (()+0x68ca) [0x7f2d6346e8ca] 
17: (clone()+0x6d) [0x7f2d61cf692d] 
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

--- begin dump of recent events --- 
0> 2012-06-06 05:38:37.203277 7f2d5ea08700 -1 *** Caught signal (Aborted) ** 
in thread 7f2d5ea08700 

ceph version 0.46 (commit:cb7f1c9c7520848b0899b26440ac34a8acea58d1) 
1: /usr/bin/ceph-mds() [0x814279] 
2: (()+0xeff0) [0x7f2d63476ff0] 
3: (gsignal()+0x35) [0x7f2d61c591b5] 
4: (abort()+0x180) [0x7f2d61c5bfc0] 
5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f2d624eddc5] 
6: (()+0xcb166) [0x7f2d624ec166] 
7: (()+0xcb193) [0x7f2d624ec193] 
8: (()+0xcb28e) [0x7f2d624ec28e] 
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) [0x74f9b0] 
10: (AnchorServer::handle_query(MMDSTableRequest*)+0x175) [0x6bdc95] 
11: (MDS::handle_deferrable_message(Message*)+0xd84) [0x4b0474] 
12: (MDS::_dispatch(Message*)+0xaf8) [0x4c50b8] 
13: (MDS::ms_dispatch(Message*)+0x1fb) [0x4c628b] 
14: (SimpleMessenger::dispatch_entry()+0x979) [0x7acb49] 
15: (SimpleMessenger::DispatchThread::entry()+0xd) [0x7336ed] 
16: (()+0x68ca) [0x7f2d6346e8ca] 
17: (clone()+0x6d) [0x7f2d61cf692d] 
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

--- end dump of recent events --- 

ceph -v reports the following on my different servers:

root@ceph1:~# ceph -v 
ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372) 
root@ceph1:~# ssh ceph2 ceph -v 
ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372) 
root@ceph1:~# ssh ceph3 ceph -v 
ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372) 
root@ceph1:~# ssh ceph4 ceph -v 
ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372) 

Is the 0.46 above the version that was running when the error occurred, or am I running the wrong binaries?
I use the Debian packages.
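
One way to see which versions actually wrote to a given log is to count the "ceph version" lines in it. This is a rough sketch, assuming the version line format seen in the dumps in this mail; versions_in_log is a made-up helper name and the sample text is shortened from the logs above. As far as I understand, the version line in a dump is printed by the daemon process itself, so it may differ from what ceph -v reports for the installed command-line tool.

```python
import re
from collections import Counter

# Pattern assumed from the dumps in this thread:
# "ceph version 0.46 (commit:cb7f1c9c...)"
VERSION_RE = re.compile(r"ceph version (\S+) \(commit:[0-9a-f]+\)")


def versions_in_log(log_text):
    """Count how often each 'ceph version X' string appears in a log."""
    return Counter(VERSION_RE.findall(log_text))


sample_log = """\
ceph version 0.46 (commit:cb7f1c9c7520848b0899b26440ac34a8acea58d1)
1: (AnchorServer::handle_query(MMDSTableRequest*)+0x175) [0x6bdc95]
ceph version 0.46 (commit:cb7f1c9c7520848b0899b26440ac34a8acea58d1)
"""
for version, count in versions_in_log(sample_log).items():
    print(version, count)
```

Running it over each MDS log file (after reading the file into a string) would show whether a daemon was still running an older binary when it crashed.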

mds.b 

0> 2012-06-06 05:38:17.533743 7fae49945700 -1 mds/AnchorServer.cc: In function 'virtual void AnchorServer::handle_query(MMDSTableRequest*)' thread 7fae49945700 time 2012-06-06 05:38:17.523498 
mds/AnchorServer.cc: 249: FAILED assert(anchor_map.count(curino) == 1) 

ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372) 
1: (AnchorServer::handle_query(MMDSTableRequest*)+0x175) [0x6c1125] 
2: (MDS::handle_deferrable_message(Message*)+0xd84) [0x4b1984] 
3: (MDS::_dispatch(Message*)+0xafa) [0x4c61da] 
4: (MDS::ms_dispatch(Message*)+0x1fb) [0x4c73ab] 
5: (SimpleMessenger::dispatch_entry()+0x979) [0x7b4729] 
6: (SimpleMessenger::DispatchThread::entry()+0xd) [0x7365cd] 
7: (()+0x68ca) [0x7fae4e3ab8ca] 
8: (clone()+0x6d) [0x7fae4cc3392d] 
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

--- end dump of recent events --- 
2012-06-06 05:38:17.711889 7fae49945700 -1 *** Caught signal (Aborted) ** 
in thread 7fae49945700 

ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372) 
1: /usr/bin/ceph-mds() [0x81da89] 
2: (()+0xeff0) [0x7fae4e3b3ff0] 
3: (gsignal()+0x35) [0x7fae4cb961b5] 
4: (abort()+0x180) [0x7fae4cb98fc0] 
5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fae4d42adc5] 
6: (()+0xcb166) [0x7fae4d429166] 
7: (()+0xcb193) [0x7fae4d429193] 
8: (()+0xcb28e) [0x7fae4d42928e] 
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) [0x7555f0] 
10: (AnchorServer::handle_query(MMDSTableRequest*)+0x175) [0x6c1125] 
11: (MDS::handle_deferrable_message(Message*)+0xd84) [0x4b1984] 
12: (MDS::_dispatch(Message*)+0xafa) [0x4c61da] 
13: (MDS::ms_dispatch(Message*)+0x1fb) [0x4c73ab] 
14: (SimpleMessenger::dispatch_entry()+0x979) [0x7b4729] 
15: (SimpleMessenger::DispatchThread::entry()+0xd) [0x7365cd] 
16: (()+0x68ca) [0x7fae4e3ab8ca] 
17: (clone()+0x6d) [0x7fae4cc3392d] 
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

--- begin dump of recent events --- 
0> 2012-06-06 05:38:17.711889 7fae49945700 -1 *** Caught signal (Aborted) ** 
in thread 7fae49945700 

ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372) 
1: /usr/bin/ceph-mds() [0x81da89] 
2: (()+0xeff0) [0x7fae4e3b3ff0] 
3: (gsignal()+0x35) [0x7fae4cb961b5] 
4: (abort()+0x180) [0x7fae4cb98fc0] 
5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fae4d42adc5] 
6: (()+0xcb166) [0x7fae4d429166] 
7: (()+0xcb193) [0x7fae4d429193] 
8: (()+0xcb28e) [0x7fae4d42928e] 
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) [0x7555f0] 
10: (AnchorServer::handle_query(MMDSTableRequest*)+0x175) [0x6c1125] 
11: (MDS::handle_deferrable_message(Message*)+0xd84) [0x4b1984] 
12: (MDS::_dispatch(Message*)+0xafa) [0x4c61da] 
13: (MDS::ms_dispatch(Message*)+0x1fb) [0x4c73ab] 
14: (SimpleMessenger::dispatch_entry()+0x979) [0x7b4729] 
15: (SimpleMessenger::DispatchThread::entry()+0xd) [0x7365cd] 
16: (()+0x68ca) [0x7fae4e3ab8ca] 
17: (clone()+0x6d) [0x7fae4cc3392d] 
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

--- end dump of recent events --- 





> For future reference, that error was because the active MDS server was in replay. I can't tell why it didn't move on to active from what you posted, but I imagine it just got a little stuck since restarting made it work out. 
> -Greg 
> 
> 
> On Tuesday, June 5, 2012 at 1:05 PM, Martin Wilderoth wrote: 
> 
> > Hello Again, 
> > 
> > I restarted the mds on all servers and then it worked again 
> > 
> > /Regards Martin 
> > 
> > > Hello 
> > > 
> > > > Hi Martin, 
> > > > 
> > > > On 06/05/2012 08:07 PM, Martin Wilderoth wrote: 
> > > > > Hello 
> > > > > 
> > > > > Is there a way to recover from this error?
> > > > > 
> > > > > mount -t ceph 10.0.6.10:/ /mnt -vv -o name=admin,secret=XXXXXXXXXXXXXXXXXXXXXXX 
> > > > > [ 506.640433] libceph: loaded (mon/osd proto 15/24, osdmap 5/6 5/6) 
> > > > > [ 506.650594] ceph: loaded (mds proto 32) 
> > > > > [ 506.652353] libceph: client0 fsid a9d5f9e1-4bb9-4fab-b79b-ba4457631b01 
> > > > > [ 506.670876] Intel AES-NI instructions are not detected. 
> > > > > [ 506.678861] libceph: mon0 10.0.6.10:6789 session established 
> > > > > mount: 10.0.6.10:/: can't read superblock 
> > > > 
> > > > 
> > > > 
> > > > Could you share some more information? For example the output from: ceph -s 
> > > 
> > > 2012-06-05 20:25:05.307914 pg v1189604: 1152 pgs: 1152 active+clean; 191 GB data, 393 GB used, 973 GB / 1379 GB avail 
> > > 2012-06-05 20:25:05.315871 mds e60: 1/1/1 up {0=c=up:replay}, 2 up:standby 
> > > 2012-06-05 20:25:05.315965 osd e1106: 8 osds: 8 up, 8 in 
> > > 2012-06-05 20:25:05.316165 log 2012-06-05 20:24:50.425527 mon.0 10.0.6.10:6789/0 75 : [INF] mds.? 10.0.6.11:6800/22974 up:boot 
> > > 2012-06-05 20:25:05.316371 mon e1: 3 mons at {a=10.0.6.10:6789/0,b=10.0.6.11:6789/0,c=10.0.6.12:6789/0} 
> > > 
> > > 
> > > > 
> > > > Did you change anything to the cluster since it worked? And what version 
> > > > are you running? 
> > > 
> > > 
> > > 
> > > I have not made any changes. I installed at version 0.46, upgraded earlier, and have been testing with 
> > > ceph, ceph-fuse, and backuppc. It hung during the ceph-fuse testing. 
> > > 
> > > Current version 
> > > ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372) 
> > > 
> > > > > One of my mds logs has 24G of data. 
> > > > 
> > > > Is it still running? 
> > > I have restarted mds.a and mds.b; they seem to be running, but not everything works. 
> > > mds.a was stopped; I am not sure about mds.b, but it has a big log file. 
> > > 
> > > > 
> > > > > 
> > > > > I have some rbd devices that I would like to keep. 
> > > > 
> > > > RBD doesn't use the MDS nor the POSIX filesystem, so you will probably 
> > > > be fine, but we need the output of "ceph -s" first. 
> > > > 
> > > > Does this work? 
> > > > $ rbd ls 
> > > 
> > > 
> > > This works; I'm still using the rbd with no problem. 
> > > > $ rados -p rbd ls 
> > > 
> > > 
> > > Seems to work; it reports something similar to: 
> > > rb.0.2.00000000052e 
> > > rb.0.0.0000000002f2 
> > > rb.0.7.000000000345 
> > > rb.0.7.000000000896 
> > > rb.0.0.000000000102 
> > > rb.0.9.000000000172 
> > > rb.0.1.000000000350 
> > > rb.0.4.000000000180 
> > > rb.0.4.00000000068b 
> > > rb.0.5.00000000054c 
> > > rb.0.2.0000000001e1 
> > > 
> > > > Wido 
> > > > 
> > > > > 
> > > > > /Regards Martin 


Regards / Med Vänlig Hälsning
Martin Wilderoth
VD
Linserv AB
Enhagsslingan 4A
SE-187 40 Täby
www.linserv.se
Tel: +46(0)8-473 60 63
Fax: +46(0)70-969 09 19
Email: martin.wilderoth@xxxxxxxxxx