----- Message from Gregory Farnum <greg at inktank.com> ---------
   Date: Tue, 1 Apr 2014 09:03:17 -0700
   From: Gregory Farnum <greg at inktank.com>
Subject: Re: ceph 0.78 mon and mds crashing (bus error)
     To: "Yan, Zheng" <ukernel at gmail.com>
     Cc: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>, ceph-users <ceph-users at lists.ceph.com>

> On Tue, Apr 1, 2014 at 7:12 AM, Yan, Zheng <ukernel at gmail.com> wrote:
>> On Tue, Apr 1, 2014 at 10:02 PM, Kenneth Waegeman
>> <Kenneth.Waegeman at ugent.be> wrote:
>>> After some more searching, I've found that the source of the problem is
>>> with the mds and not the mon. The mds crashes, generates a core dump that
>>> eats the local space, and in turn the monitor (because of leveldb) crashes.
>>>
>>> The error in the mds log of one host:
>>>
>>> 2014-04-01 15:46:34.414615 7f870e319700 0 -- 10.141.8.180:6836/13152 >>
>>> 10.141.8.180:6789/0 pipe(0x517371180 sd=54 :42439 s=4 pgs=0 cs=0 l=1
>>> c=0x147ac780).connect got RESETSESSION but no longer connecting
>>> 2014-04-01 15:46:34.438792 7f871194f700 0 -- 10.141.8.180:6836/13152 >>
>>> 10.141.8.180:6789/0 pipe(0x1b099f580 sd=8 :43150 s=4 pgs=0 cs=0 l=1
>>> c=0x1fd44360).connect got RESETSESSION but no longer connecting
>>> 2014-04-01 15:46:34.439028 7f870e319700 0 -- 10.141.8.180:6836/13152 >>
>>> 10.141.8.182:6789/0 pipe(0x13aa64880 sd=54 :37085 s=4 pgs=0 cs=0 l=1
>>> c=0x1fd43de0).connect got RESETSESSION but no longer connecting
>>> 2014-04-01 15:46:34.468257 7f871b7ae700 -1 mds/CDir.cc: In function 'void
>>> CDir::_omap_fetched(ceph::bufferlist&, std::map<std::basic_string<char,
>>> std::char_traits<char>, std::allocator<char> >, ceph::buffer::list,
>>> std::less<std::basic_string<char, std::char_traits<char>,
>>> std::allocator<char> > >, std::allocator<std::pair<const
>>> std::basic_string<char, std::char_traits<char>, std::allocator<char> >,
>>> ceph::buffer::list> > >&, const std::string&, int)' thread 7f871b7ae700
>>> time 2014-04-01 15:46:34.448320
>>> mds/CDir.cc: 1474: FAILED assert(r == 0 || r == -2 || r == -61)
>>>
>>
>> Could you use gdb to check what the value of the variable 'r' is?
>
> If you look at the crash dump log you can see the return value in the
> osd_op_reply message:
>
>     -1> 2014-04-01 15:46:34.440860 7f871b7ae700 1 --
> 10.141.8.180:6836/13152 <== osd.3 10.141.8.180:6827/4366 33077 ====
> osd_op_reply(4179177 100001f2ef1.00000000 [omap-get-header
> 0~0,omap-get-vals 0~16] v0'0 uv0 ack = -108 (Cannot send after
> transport endpoint shutdown)) v6 ==== 229+0+0 (958358678 0 0)
> 0x2cff7aa80 con 0x37ea3c0
>
> That is -108, which is ESHUTDOWN, but we also use it (via the 108
> constant, I think because ESHUTDOWN varies across platforms) as
> EBLACKLISTED. So it looks like this is itself a symptom of another
> problem that is causing the MDS to be timed out by the monitor. If a
> core dump is "eating the local space", maybe the MDS is stuck in an
> infinite allocation loop of some kind? How big are your disks,
> Kenneth? Do you have any information on how much CPU/memory the MDS
> was using before this?
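For reference, the assert at mds/CDir.cc:1474 only tolerates 0, -ENOENT (-2)
and -ENODATA (-61), so the -108 reply above is enough to bring the MDS down.
Below is a minimal standalone sketch of that check; the EBLACKLISTED name is
taken from Greg's description of the 108 constant, and the rest of the code is
illustrative only, not the actual Ceph source:

    // Illustrative sketch, not the real mds/CDir.cc code.
    // Per the thread, Ceph pins the blacklist error to the constant 108
    // (EBLACKLISTED) because the value of ESHUTDOWN differs across platforms.
    #include <cassert>
    #include <cerrno>

    static const int EBLACKLISTED = 108;

    // Mirrors the failing check: the omap-fetch callback only expects
    // success, "object missing" (-ENOENT) or "no omap data" (-ENODATA).
    static void omap_fetched(int r) {
        assert(r == 0 || r == -ENOENT || r == -ENODATA);   // CDir.cc:1474
    }

    int main() {
        omap_fetched(0);              // ok
        omap_fetched(-ENOENT);        // ok (-2)
        omap_fetched(-EBLACKLISTED);  // -108: the assert fires, the MDS aborts
        return 0;
    }

In other words, the blacklist error is not one of the return codes this fetch
path expects, which is why it surfaces as an assert and a core dump rather
than a clean shutdown.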
I monitored the mds process after restart:

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
19215 root      20   0 6070m 5.7g 5236 S 778.6 18.1   1:27.54 ceph-mds
19215 root      20   0 7926m 7.5g 5236 S 179.2 23.8   2:44.39 ceph-mds
19215 root      20   0 12.4g  12g 5236 S 157.2 38.8   3:43.47 ceph-mds
19215 root      20   0 16.6g  16g 5236 S 144.4 52.0   4:15.01 ceph-mds
19215 root      20   0 19.9g  19g 5236 S 137.2 62.5   4:35.83 ceph-mds
19215 root      20   0 24.5g  24g 5224 S 136.5 77.0   5:04.66 ceph-mds
19215 root      20   0 25.8g  25g 2944 S  33.7 81.2   5:13.74 ceph-mds
19215 root      20   0 26.0g  25g 2916 S  24.6 81.7   5:19.07 ceph-mds
19215 root      20   0 26.1g  25g 2916 S  13.0 82.1   5:22.16 ceph-mds
19215 root      20   0 27.7g  26g 1856 S 100.0 85.8   5:36.46 ceph-mds

Then it crashes. I changed the core dump location to a path outside the root
fs; the core dump is indeed about 26G.

My disks:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       9.9G  2.9G  6.5G  31% /
tmpfs            16G     0   16G   0% /dev/shm
/dev/sda1       248M   53M  183M  23% /boot
/dev/sda4       172G   61G  112G  35% /var/lib/ceph/log/sda4
/dev/sdb        187G   61G  127G  33% /var/lib/ceph/log/sdb
/dev/sdc        3.7T  1.7T  2.0T  47% /var/lib/ceph/osd/sdc
/dev/sdd        3.7T  1.5T  2.2T  41% /var/lib/ceph/osd/sdd
/dev/sde        3.7T  1.4T  2.4T  37% /var/lib/ceph/osd/sde
/dev/sdf        3.7T  1.5T  2.3T  39% /var/lib/ceph/osd/sdf
/dev/sdg        3.7T  2.1T  1.7T  56% /var/lib/ceph/osd/sdg
/dev/sdh        3.7T  1.7T  2.0T  47% /var/lib/ceph/osd/sdh
/dev/sdi        3.7T  1.7T  2.0T  47% /var/lib/ceph/osd/sdi
/dev/sdj        3.7T  1.7T  2.0T  47% /var/lib/ceph/osd/sdj
/dev/sdk        3.7T  2.1T  1.6T  58% /var/lib/ceph/osd/sdk
/dev/sdl        3.7T  1.7T  2.0T  46% /var/lib/ceph/osd/sdl
/dev/sdm        3.7T  1.5T  2.2T  41% /var/lib/ceph/osd/sdm
/dev/sdn        3.7T  1.4T  2.3T  38% /var/lib/ceph/osd/sdn

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com

----- End message from Gregory Farnum <greg at inktank.com> -----

--
Kind regards,
Kenneth Waegeman