Re: mds crash

Gregory Farnum <greg@xxxxxxxxxxx> · Thu, 21 Feb 2013 09:32:58 -0800

That backtrace is pretty generic (almost every crash during replay
produces it). Can you add

debug mds = 20
debug journaler = 20
debug ms = 1

to your MDS' ceph.conf, restart it, and put that log somewhere accessible?
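
As a rough sketch of where those lines would go, assuming they belong under
the [mds] section of the ceph.conf on the MDS host (the section placement is
an assumption here, not something spelled out above):

```ini
[mds]
    ; verbose MDS logging, so the replay steps before the assert are visible
    debug mds = 20
    ; verbose journal reader logging
    debug journaler = 20
    ; basic messenger logging
    debug ms = 1
```

These levels make the log grow quickly, so it is probably worth removing
them again once the crash has been captured.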
-Greg

On Wed, Feb 20, 2013 at 11:12 PM, Steffen Thorhauer
<thorhaue@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
> Hello,
> I have a test Ceph cluster on Ubuntu 12.04 and upgraded it to 0.57 yesterday.
> But after the upgrade the MDS dies.
> ceph -s says
>   health HEALTH_WARN mds 0 is laggy
>    monmap e2: 5 mons at {0=10.37.124.161:6789/0,1=10.37.124.162:6789/0,2=10.37.124.163:6789/0,3=10.37.124.164:6789/0,4=10.37.124.167:6789/0}, election epoch 46, quorum 0,1,2,3,4 0,1,2,3,4
>    osdmap e901: 6 osds: 6 up, 6 in
>     pgmap v89987: 1280 pgs: 1280 active+clean; 123 GB data, 373 GB used, 225 GB / 599 GB avail
>    mdsmap e2296: 1/1/1 up {0=0=up:replay(laggy or crashed)}
>
> The only thing I get is a crash report in
> /var/log/ceph/ceph-mds.0.log:
>    0> 2013-02-21 08:04:02.892956 7f8d50955700 -1 *** Caught signal (Aborted) **
>  in thread 7f8d50955700
>
>  ceph version 0.57 (9a7a9d06c0623ccc116a1d3b71c765c20a17e98e)
>  1: /usr/bin/ceph-mds() [0x81b03a]
>  2: (()+0xfcb0) [0x7f8d58a59cb0]
>  3: (gsignal()+0x35) [0x7f8d57835425]
>  4: (abort()+0x17b) [0x7f8d57838b8b]
>  5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f8d5818769d]
>  6: (()+0xb5846) [0x7f8d58185846]
>  7: (()+0xb5873) [0x7f8d58185873]
>  8: (()+0xb596e) [0x7f8d5818596e]
>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0x7833cf]
>  10: (EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)+0x1e34) [0x4e0d94]
>  11: (EUpdate::replay(MDS*)+0x3a) [0x4e8e1a]
>  12: (MDLog::_replay_thread()+0x438) [0x6ad448]
>  13: (MDLog::ReplayThread::entry()+0xd) [0x4cc0ad]
>  14: (()+0x7e9a) [0x7f8d58a51e9a]
>  15: (clone()+0x6d) [0x7f8d578f2cbd]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> --- logging levels ---
>    0/ 5 none
>    0/ 1 lockdep
>    0/ 1 context
>    1/ 1 crush
>    1/ 5 mds
>    1/ 5 mds_balancer
>    1/ 5 mds_locker
>    1/ 5 mds_log
>    1/ 5 mds_log_expire
>    1/ 5 mds_migrator
>    0/ 1 buffer
>    0/ 1 timer
>    0/ 1 filer
>    0/ 1 striper
>    0/ 1 objecter
>    0/ 5 rados
>    0/ 5 rbd
>    0/ 5 journaler
>    0/ 5 objectcacher
>    0/ 5 client
>    0/ 5 osd
>    0/ 5 optracker
>    0/ 5 objclass
>    1/ 3 filestore
>    1/ 3 journal
>    0/ 5 ms
>    1/ 5 mon
>    0/10 monc
>    0/ 5 paxos
>    0/ 5 tp
>    1/ 5 auth
>    1/ 5 crypto
>    1/ 1 finisher
>    1/ 5 heartbeatmap
>    1/ 5 perfcounter
>    1/ 5 rgw
>    1/ 5 hadoop
>    1/ 5 javaclient
>    1/ 5 asok
>    1/ 1 throttle
>   -2/-2 (syslog threshold)
>   -1/-1 (stderr threshold)
>   max_recent    100000
>   max_new         1000
>   log_file /var/log/ceph/ceph-mds.0.log
> --- end dump of recent events ---
>
> Regards,
>   Steffen Thorhauer
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com