Sorry, there is a bug in 13.2.2 that breaks compatibility of the purge
queue disk format. Please downgrade the MDS to 13.2.1, then run
'ceph mds repaired cephfs_name:0'.

Regards
Yan, Zheng

On Mon, Oct 8, 2018 at 9:20 AM Alfredo Daniel Rezinovsky
<alfrenovsky@xxxxxxxxx> wrote:
>
> Cluster with 4 nodes
>
> node 1: 2 HDDs
> node 2: 3 HDDs
> node 3: 3 HDDs
> node 4: 2 HDDs
>
> After a problem with the upgrade from 13.2.1 to 13.2.2 (I restarted the
> nodes one at a time; I think that was the problem)
>
> I upgraded with ubuntu apt-get upgrade. I had 1 active MDS at a time
> when I did the upgrade.
>
> All MDSs stopped working.
>
> Status shows 1 crashed and none in standby.
>
> If I restart an MDS, status shows replay and then it crashes with this
> log output:
>
> ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic
> (stable)
> 1: (()+0x3f5480) [0x555de8a51480]
> 2: (()+0x12890) [0x7f6e4cb41890]
> 3: (gsignal()+0xc7) [0x7f6e4bc39e97]
> 4: (abort()+0x141) [0x7f6e4bc3b801]
> 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x250) [0x7f6e4d22a710]
> 6: (()+0x26c787) [0x7f6e4d22a787]
> 7: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x5f4b)
> [0x555de8a3c83b]
> 8: (EUpdate::replay(MDSRank*)+0x39) [0x555de8a3dd79]
> 9: (MDLog::_replay_thread()+0x864) [0x555de89e6e04]
> 10: (MDLog::ReplayThread::entry()+0xd) [0x555de8784ebd]
> 11: (()+0x76db) [0x7f6e4cb366db]
> 12: (clone()+0x3f) [0x7f6e4bd1c88f]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
> to interpret this
>
> The journal reports OK.
>
> Now I'm trying:
>
> cephfs-data-scan scan_extents cephfs_data
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
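
A minimal sketch of the downgrade-and-repair sequence described in the
reply above, assuming Ubuntu packages and one MDS per host; the 13.2.1
package version suffix and the MDS instance id are placeholders, not
verified values:

    # on each MDS host: pin the MDS package back to 13.2.1
    # (replace <build> with the version shown by 'apt-cache madison ceph-mds')
    apt-get install ceph-mds=13.2.1-<build>
    # restart the local MDS daemon; the unit id is usually the short hostname
    systemctl restart ceph-mds@$(hostname -s)

    # then mark rank 0 of the filesystem as repaired and watch it come back
    ceph mds repaired cephfs_name:0
    ceph fs status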