Dear list,
We run a 7-node Proxmox cluster with Ceph Nautilus (14.2.18) and two
CephFS filesystems, mounted in Debian Buster VMs using the cephfs kernel
module.
Four times in the last six months, all MDS servers failed one after the
other with an assert, either in the rename_prepare or the unlink_local
function.
Here is the latest one, from yesterday:
-1> 2021-07-26 17:05:37.046 7fc518787700 -1 /build/ceph/ceph-14.2.18/src/mds/Server.cc: In function 'void Server::_rename_prepare(MDRequestRef&, EMetaBlob*, ceph::bufferlist*, CDentry*, CDentry*, CDentry*)' thread 7fc518787700 time 2021-07-26 17:05:37.049487
/build/ceph/ceph-14.2.18/src/mds/Server.cc: 8435: FAILED ceph_assert(srci->first <= destdn->first)

ceph version 14.2.18 (0cf7f22162b9b2809afe64b2e01779bdd70b850c) nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7fc522ac5cf2]
2: (()+0x277eca) [0x7fc522ac5eca]
3: (Server::_rename_prepare(boost::intrusive_ptr<MDRequestImpl>&, EMetaBlob*, ceph::buffer::v14_2_0::list*, CDentry*, CDentry*, CDentry*)+0x2d3e) [0x561fd3be626e]
4: (Server::handle_client_rename(boost::intrusive_ptr<MDRequestImpl>&)+0xc21) [0x561fd3be6f71]
5: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xcd4) [0x561fd3bf6454]
6: (MDCache::dispatch_request(boost::intrusive_ptr<MDRequestImpl>&)+0x38) [0x561fd3c82c68]
7: (MDSContext::complete(int)+0x7f) [0x561fd3df745f]
8: (MDSRank::_advance_queues()+0xac) [0x561fd3b724dc]
9: (MDSRank::ProgressThread::entry()+0x3d) [0x561fd3b72a6d]
10: (()+0x7fa3) [0x7fc521e8dfa3]
11: (clone()+0x3f) [0x7fc52181f4cf]
Restarting the MDS is not sufficient: the in-kernel CephFS client
re-issues the request and the crash re-occurs.
It is "solved" by identifying the VM issuing the rename, shutting it
down and restarting the MDS.
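For reference, this is roughly how the offending client can be tracked
down via the MDS admin socket (a sketch only; the MDS name below is a
placeholder):

  # on the host running the active MDS: list the requests currently in
  # flight; the stuck rename shows up here together with a client id
  ceph daemon mds.<active-mds> dump_ops_in_flight

  # map that client id to a session (mount, IP address) to find the VM
  ceph daemon mds.<active-mds> session ls

  # after shutting the VM down, restart the MDS
  systemctl restart ceph-mds@<active-mds>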
The folder containing the offending file(s) is then identified, moved to
quarantine and replaced by a copy.
Each time we were able to copy the folder using rsync; the copies did
not trigger the assert.
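The quarantine step is nothing fancy; roughly the following, run from a
client with the filesystem mounted (paths are made up for illustration):

  # move the broken folder aside and put an rsync'ed copy in its place
  mv /mnt/cephfs/data/projects/foo /mnt/cephfs/quarantine/foo
  rsync -a /mnt/cephfs/quarantine/foo/ /mnt/cephfs/data/projects/foo/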
There wasn't any other failure prior to the assert: yesterday I had
marked 4 OSDs out a few minutes before the crash and had checked ceph
status; they were backfilling fine.
1. Has this happened to anybody else?
2. How can we identify the broken files? How can we delete them or
repair the metadata?
I'll be happy to provide more information if necessary.
Thanks,