Thanks for the info, Paul.
Our cluster is 130 GB in size at present, and we are just starting out with Ceph adoption in our company.
For now, I am looking for guidance from the community; it will also help us learn more about the product and the support that is available.
Thanks,
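
P.S. If I understand your suggestion below correctly, resetting and rebuilding the inode table followed by a backwards scan would look roughly like the sketch here. The data pool name (cephfs_data) is an assumption based on our setup, and I am not sure every step applies to our release, so please correct me before we run any of it:

# with all MDS daemons stopped
systemctl stop ceph-mds.target

# reset the session and inode tables
cephfs-table-tool all reset session
cephfs-table-tool all reset inode

# rebuild the metadata with a backwards scan of the data pool
cephfs-data-scan init
cephfs-data-scan scan_extents cephfs_data
cephfs-data-scan scan_inodes cephfs_data
cephfs-data-scan scan_links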
On Fri, 10 Aug 2018 at 9:52 PM, Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
Sorry, a step-by-step guide through something like that is beyond the scope of what we can do on a mailing list. But what I would do here is carefully assess the situation/the damage. My wild guess would be to reset and rebuild the inode table, but that might be incorrect and unsafe without looking into it further.

I don't want to solicit our services here, but we do Ceph recoveries regularly; reach out to us if you are looking for a consultant.

Paul

2018-08-10 18:05 GMT+02:00 Amit Handa <amit.handa@xxxxxxxxx>:

Thanks a lot, Paul.

We did (hopefully) follow through with the disaster recovery. However, please guide me on how to get the cluster back up!

Thanks,
--

On Fri, Aug 10, 2018 at 9:32 PM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:

Looks like you got some duplicate inodes due to corrupted metadata. You likely attempted a disaster recovery and didn't follow through with it completely, or you hit some bug in Ceph.

The solution here is probably to do a full recovery of the metadata / a full backwards scan after resetting the inodes. I've recovered a cluster from something similar just a few weeks ago. Annoying, but recoverable.

Paul

2018-08-10 13:26 GMT+02:00 Amit Handa <amit.handa@xxxxxxxxx>:

We are facing constant crashes of the Ceph MDS. We have installed Mimic (v13.2.1).
mds: cephfs-1/1/1 up {0=node2=up:active(laggy or crashed)}
MDS logs: https://pastebin.com/AWGMLRm0
We have followed the DR steps listed at
http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/

Please help in resolving the errors :(
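
For reference, my understanding of the sequence on that page is roughly the sketch below; the filesystem name (cephfs) and rank (0) are how our setup looks, and the exact flags may differ between releases, so treat it as an outline rather than exactly what we typed:

# back up the journal before touching anything
cephfs-journal-tool --rank=cephfs:0 journal export backup.bin

# recover what can be salvaged from the damaged journal, then reset it
cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
cephfs-journal-tool --rank=cephfs:0 journal reset

# wipe the session table and, if the rank was marked damaged, mark it repaired
cephfs-table-tool all reset session
ceph mds repaired 0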
MDS crash stacktrace:
ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0xff) [0x7f984fc3ee1f]
2: (()+0x284fe7) [0x7f984fc3efe7]
3: (()+0x2087fe) [0x5563e88537fe]
4: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&, CDir*, inodeno_t, unsigned int, file_layout_t*)+0xf37) [0x5563e87ce777]
5: (Server::handle_client_openc(boost::intrusive_ptr<MDRequestImpl>&)+0xdb0) [0x5563e87d0bd0]
6: (Server::handle_client_request(MClientRequest*)+0x49e) [0x5563e87d3c0e]
7: (Server::dispatch(Message*)+0x2db) [0x5563e87d789b]
8: (MDSRank::handle_deferrable_message(Message*)+0x434) [0x5563e87514b4]
9: (MDSRank::_dispatch(Message*, bool)+0x63b) [0x5563e875db5b]
10: (MDSRank::retry_dispatch(Message*)+0x12) [0x5563e875e302]
11: (MDSInternalContextBase::complete(int)+0x67) [0x5563e89afb57]
12: (MDSRank::_advance_queues()+0xd1) [0x5563e875cd51]
13: (MDSRank::ProgressThread::entry()+0x43) [0x5563e875d3e3]
14: (()+0x7e25) [0x7f984d869e25]
15: (clone()+0x6d) [0x7f984c949bad]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com