Crashing MDS

Hi Everyone,

I have an MDS server that's crashing moments after it starts. The filesystem is set to max_mds=5, and mds.[1-4] are all up and active, but mds.0 keeps crashing. All I can see is the following in the /var/log/ceph/ceph-mds..... logfile. Any thoughts?


    -2> 2022-06-08 10:02:59.408 7fc0a479d700  4 mds.0.server handle_client_request client_request(client.162790796:2313708455 create #0x500012ba51c/0192.jpg 2022-06-08 09:06:03.780237 RETRY=19 caller_uid=10363898, caller_gid=10363898{10363898,}) v4
    -1> 2022-06-08 10:02:59.410 7fc0a479d700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc: In function 'void MDCache::add_inode(CInode*)' thread 7fc0a479d700 time 2022-06-08 10:02:59.409711
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc: 279: FAILED ceph_assert(!p)

 ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7fc0b2cbb875]
 2: (()+0x253a3d) [0x7fc0b2cbba3d]
 3: (()+0x20f84e) [0x56386170d84e]
 4: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&, CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad) [0x5638616a0f1d]
 5: (Server::handle_client_openc(boost::intrusive_ptr<MDRequestImpl>&)+0xcf1) [0x5638616b0a91]
 6: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xb5b) [0x5638616d78fb]
 7: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x2f8) [0x5638616d7d78]
 8: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x122) [0x5638616e3722]
 9: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0x6dc) [0x5638616585ec]
 10: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7ea) [0x56386165aa4a]
 11: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x56386165af72]
 12: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
 13: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
 14: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x1d0) [0x56386165a430]
 15: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x56386165af72]
 16: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
 17: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
 18: (MDSRank::ProgressThread::entry()+0x3d) [0x56386165a13d]
 19: (()+0x7ea5) [0x7fc0b0b7aea5]
 20: (clone()+0x6d) [0x7fc0af8288dd]

     0> 2022-06-08 10:02:59.413 7fc0a479d700 -1 *** Caught signal (Aborted) **
 in thread 7fc0a479d700 thread_name:mds_rank_progr

 ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)
 1: (()+0xf630) [0x7fc0b0b82630]
 2: (gsignal()+0x37) [0x7fc0af760387]
 3: (abort()+0x148) [0x7fc0af761a78]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x7fc0b2cbb8c4]
 5: (()+0x253a3d) [0x7fc0b2cbba3d]
 6: (()+0x20f84e) [0x56386170d84e]
 7: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&, CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad) [0x5638616a0f1d]
 8: (Server::handle_client_openc(boost::intrusive_ptr<MDRequestImpl>&)+0xcf1) [0x5638616b0a91]
 9: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xb5b) [0x5638616d78fb]
 10: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x2f8) [0x5638616d7d78]
 11: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x122) [0x5638616e3722]
 12: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0x6dc) [0x5638616585ec]
 13: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7ea) [0x56386165aa4a]
 14: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x56386165af72]
 15: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
 16: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
 17: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x1d0) [0x56386165a430]
 18: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x56386165af72]
 19: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
 20: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
 21: (MDSRank::ProgressThread::entry()+0x3d) [0x56386165a13d]
 22: (()+0x7ea5) [0x7fc0b0b7aea5]
 23: (clone()+0x6d) [0x7fc0af8288dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
<snip>


Thanks

-Dave
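
For anyone comparing the two backtraces above, here is a rough Python sketch that splits a pasted trace into (frame number, symbol+offset, address) tuples, even when several frames have run together on one line. It assumes the frame layout seen in this log, "N: (symbol+offset) [0xaddress]"; the names FRAME_RE and parse_frames are made up for illustration and are not part of any Ceph tooling.

```python
import re

# Assumed frame layout, taken from the log above:
#   " 12: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]"
# The symbol itself may contain parentheses, so match lazily up to ") [0x...]".
FRAME_RE = re.compile(r"(\d+): \((.*?)\) \[(0x[0-9a-f]+)\]")


def parse_frames(trace):
    """Return a list of (index, symbol_with_offset, address) tuples.

    Works on multi-line text and on lines where frames were fused together,
    because the pattern is not anchored to the start of a line.
    """
    return [(int(n), sym, addr) for n, sym, addr in FRAME_RE.findall(trace)]


if __name__ == "__main__":
    sample = (
        " 2: (()+0x253a3d) [0x7fc0b2cbba3d]\n"
        " 12: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]"
        "  13: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]"
    )
    for index, symbol, address in parse_frames(sample):
        print(index, symbol, address)
```

Frames whose symbol starts with "()" are the unresolved ones; as the NOTE in the log says, those need the executable or its objdump output to interpret.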
