Cluster with 4 nodes
node 1: 2 HDDs
node 2: 3 HDDs
node 3: 3 HDDs
node 4: 2 HDDs
After a problem with the upgrade from 13.2.1 to 13.2.2 (I restarted the
nodes one at a time):
I upgraded with Ubuntu's apt-get upgrade and had only one active MDS at
a time while doing the upgrade.
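
For reference, the per-node steps were roughly as follows (a sketch;
package and systemd unit names assumed for a stock Ubuntu/Ceph install):

apt-get update && apt-get upgrade   # pulls the 13.2.2 packages
systemctl restart ceph-mon.target   # then restart daemons on this node
systemctl restart ceph-osd.target
systemctl restart ceph-mds.target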
All MDSs stopped working.
Status shows one crashed and none in standby.
If I restart an MDS, status shows replay and then it crashes with this
log output:
ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic
(stable)
1: (()+0x3f5480) [0x555de8a51480]
2: (()+0x12890) [0x7f6e4cb41890]
3: (gsignal()+0xc7) [0x7f6e4bc39e97]
4: (abort()+0x141) [0x7f6e4bc3b801]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x250) [0x7f6e4d22a710]
6: (()+0x26c787) [0x7f6e4d22a787]
7: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x5f4b)
[0x555de8a3c83b]
8: (EUpdate::replay(MDSRank*)+0x39) [0x555de8a3dd79]
9: (MDLog::_replay_thread()+0x864) [0x555de89e6e04]
10: (MDLog::ReplayThread::entry()+0xd) [0x555de8784ebd]
11: (()+0x76db) [0x7f6e4cb366db]
12: (clone()+0x3f) [0x7f6e4bd1c88f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
to interpret this
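
Per that NOTE, the frame offsets can be resolved against the matching
binary; a sketch, assuming the Ubuntu debug-symbol package and the
default binary path:

apt-get install ceph-mds-dbg                # debug symbols (package name assumed)
addr2line -Cfe /usr/bin/ceph-mds 0x3f5480   # resolve e.g. frame 1 to file:line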
The journal reports OK.
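
(i.e. cephfs-journal-tool finds no damage; the check I mean, with the
filesystem name as a placeholder:)

cephfs-journal-tool --rank=<fsname>:0 journal inspect
# healthy output ends with: Overall journal integrity: OK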
Now I'm trying:
cephfs-data-scan scan_extents cephfs_data
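
If that's the right path, the disaster-recovery docs describe
scan_extents as the first of three passes (run with all MDSs stopped;
pool name as above; scan_extents can take a very long time and can be
parallelised with --worker_n/--worker_m):

cephfs-data-scan scan_extents cephfs_data
cephfs-data-scan scan_inodes cephfs_data
cephfs-data-scan scan_links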