Clients can’t work or unmount after the MDS crash

Today I found that the clients (1 local, 2 remote) can’t access Ceph.

root@ceph01:/ # ceph -s
10.09.22_20:05:48.344485    pg v24138: 1320 pgs: 1320 active+clean;
111 GB data, 286 GB used, 924 GB / 1210 GB avail
10.09.22_20:05:48.352327   mds e28: 1/1/1 up {0=up:active(laggy or crashed)}
10.09.22_20:05:48.375330   osd e255: 5 osds: 5 up, 5 in
10.09.22_20:05:48.388638   log 10.09.21_10:08:33.761798 mds0
***.***.248.176:6800/3780 11 : [INF] closing stale session client4370
1**.***.229.105:0/3599534167 after 300.097931
10.09.22_20:05:48.419286   mon e1: 1 mons at ***.***.248.177:6789/0
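The line to look at in the output above is the mds one, which reports "laggy or crashed". As a rough sketch only, a check like the following could flag that state from saved `ceph -s` text. The regular expression is written against the single mds line shown here; the function name `mds_problems` is made up for illustration, and the comma-separated handling of several ranks is an assumption, since only one rank appears in this output.

```python
import re

# Matches the mds summary line as it appears in the `ceph -s` output above,
# e.g. "mds e28: 1/1/1 up {0=up:active(laggy or crashed)}".
MDS_LINE = re.compile(r"mds e(?P<epoch>\d+): (?P<counts>\S+) up \{(?P<states>[^}]*)\}")


def mds_problems(status_text):
    """Return (rank, state) pairs whose state mentions laggy or crashed."""
    problems = []
    for line in status_text.splitlines():
        m = MDS_LINE.search(line)
        if not m:
            continue
        # Assumption: several ranks would be comma-separated inside the braces.
        for entry in m.group("states").split(","):
            rank, _, state = entry.partition("=")
            if "laggy" in state or "crashed" in state:
                problems.append((rank.strip(), state.strip()))
    return problems


sample = "10.09.22_20:05:48.352327   mds e28: 1/1/1 up {0=up:active(laggy or crashed)}"
print(mds_problems(sample))
```

This only reads text that has already been captured; it does not call ceph itself.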

root@ceph01/# ceph mds dump -o -
10.09.22_20:32:13.770455 mon <- [mds,dump]
10.09.22_20:32:13.788468 mon0 -> 'dumped mdsmap epoch 28' (0)
epoch 28

client_epoch 0
created 10.08.26_03:27:01.753124
modified 10.09.21_23:40:05.176168
tableserver 0
root 0
session_timeout 60
session_autoclose 300

compat  compat={},rocompat={},incompat={1=base v0.20}

max_mds 1
in      0
up      {0=4298}
failed
stopped
4298:   ???.???.248.176:6800/3780 'ceph02' mds0.6 up:active seq 260551
laggy since 10.09.21_23:40:05.160654

10.09.22_20:32:13.788619 wrote 358 byte payload to -


The core dump from ceph02 (???.???.248.176) shows the following:
…
Core was generated by `/usr/bin/cmds -i cep02 -c /tmp/ceph.conf.19923'.
Program terminated with signal 6, Aborted.
#0  0x004ca422 in __kernel_vsyscall ()

 (gdb) bt
#0  0x004ca422 in __kernel_vsyscall ()
#1  0x007c9651 in raise () from /lib/tls/i686/cmov/libc.so.6
#2  0x007cca82 in abort () from /lib/tls/i686/cmov/libc.so.6
#3  0x00cab52f in __gnu_cxx::__verbose_terminate_handler() () from
/usr/lib/libstdc++.so.6
#4  0x00ca9465 in ?? () from /usr/lib/libstdc++.so.6
#5  0x00ca94a2 in std::terminate() () from /usr/lib/libstdc++.so.6
#6  0x00ca95e1 in __cxa_throw () from /usr/lib/libstdc++.so.6
#7  0x08312a7b in ceph::__ceph_assert_fail(char const*, char const*,
int, char const*) ()
#8  0x080c3eeb in SimpleMessenger::Pipe::accept() ()
#9  0x080c4ba0 in SimpleMessenger::Pipe::reader() ()
#10 0x080b7d14 in SimpleMessenger::Pipe::Reader::entry() ()
#11 0x080ca7c1 in Thread::_entry_func(void*) ()
#12 0x004a896e in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#13 0x0086ca4e in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb)

The tail of the MDS log on ceph02 is as follows:

10.09.21_23:37:20.331651 b4d43b70 -- ???.???.248.176:6800/3780 -->
mon0 ???.???.248.177:6789/0 -- mdsbeacon(4298/lz05 up:active seq
497179 v27) v1 -- ?+0 0x8be0de0
10.09.21_23:37:20.332446 b6347b70 -- ???.???.248.176:6800/3780 <==
mon0 ???.???.248.177:6789/0 511955 ==== mdsbeacon(4298/ceph02
up:active seq 497179 v27) v1 ==== 70+0+0 (116627713 0 0) 0xb2c967a8
10.09.21_23:37:22.602613 af9ffb70 -- ???.???.248.176:6800/3780 >>
???.??.229.124:0/2562359250 pipe(0x8c26b80 sd=23 pgs=0 cs=0
l=0).accept peer addr is really ???.???.229.124:0/2562359250 (socket
is  ???.???.229.124:52272/0)
10.09.21_23:37:22.602813 af9ffb70 -- ???.???.248.176:6800/3780 >>
???.???.229.124:0/2562359250 pipe(0x8c26b80 sd=23 pgs=0 cs=0
l=0).accept connect_seq 1 vs existing 1 state 2
msg/SimpleMessenger.cc: In function 'int SimpleMessenger::Pipe::accept()':
msg/SimpleMessenger.cc:740: FAILED assert(existing->state ==
STATE_CONNECTING || existing->state == STATE_STANDBY ||
existing->state == STATE_WAIT)
 1: (SimpleMessenger::Pipe::reader()+0x830) [0x80c4ba0]
 2: (SimpleMessenger::Pipe::Reader::entry()+0x14) [0x80b7d14]
 3: (Thread::_entry_func(void*)+0x11) [0x80ca7c1]
 4: (()+0x596e) [0x4a896e]
 5: (clone()+0x5e) [0x86ca4e]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
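
The key information in the log tail above is the FAILED assert line, which is wrapped over three lines in this mail. As a minimal sketch, assuming the log keeps the `file:line: FAILED assert(...)` shape seen here, something like the following could pull the file, line number and full expression out of a saved log. The function name `find_failed_assert` is hypothetical and not part of ceph.

```python
import re

# Start of an assertion report as seen in the log tail above:
# "msg/SimpleMessenger.cc:740: FAILED assert(existing->state =="
ASSERT_START = re.compile(r"^(?P<file>\S+):(?P<line>\d+): FAILED assert\((?P<rest>.*)$")


def find_failed_assert(log_lines):
    """Return (file, line_number, expression) for the first FAILED assert, or None.

    The expression may be wrapped across several lines, so following lines
    are appended until the parentheses opened by "assert(" are balanced.
    """
    for i, raw in enumerate(log_lines):
        m = ASSERT_START.match(raw.strip())
        if not m:
            continue
        expr = m.group("rest")
        depth = 1 + expr.count("(") - expr.count(")")
        j = i + 1
        while depth > 0 and j < len(log_lines):
            nxt = log_lines[j].strip()
            expr += " " + nxt
            depth += nxt.count("(") - nxt.count(")")
            j += 1
        if depth == 0 and expr.endswith(")"):
            expr = expr[:-1]  # drop the parenthesis that closes assert(
        return m.group("file"), int(m.group("line")), expr
    return None


tail = [
    "msg/SimpleMessenger.cc: In function 'int SimpleMessenger::Pipe::accept()':",
    "msg/SimpleMessenger.cc:740: FAILED assert(existing->state ==",
    "STATE_CONNECTING || existing->state == STATE_STANDBY ||",
    "existing->state == STATE_WAIT)",
    " 1: (SimpleMessenger::Pipe::reader()+0x830) [0x80c4ba0]",
]
print(find_failed_assert(tail))
```

Run against the tail above, this reports msg/SimpleMessenger.cc, line 740, and the three-way state check in Pipe::accept() that failed.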