Hi Dave,

Just to make sure, have you checked whether the host has free inodes available?

On Wed, 8 Jun 2022 at 19:22, Dave Schulz <dschulz@xxxxxxxxxxx> wrote:
> Hi Everyone,
>
> I have an MDS server that's crashing moments after it starts. The
> filesystem is set to max_mds=5 and mds.[1-4] are all up and active but
> mds.0 keeps crashing. All I can see is the following in the
> /var/log/ceph/ceph-mds..... logfile. Any thoughts?
>
>     -2> 2022-06-08 10:02:59.408 7fc0a479d700  4 mds.0.server handle_client_request client_request(client.162790796:2313708455 create #0x500012ba51c/0192.jpg 2022-06-08 09:06:03.780237 RETRY=19 caller_uid=10363898, caller_gid=10363898{10363898,}) v4
>     -1> 2022-06-08 10:02:59.410 7fc0a479d700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc: In function 'void MDCache::add_inode(CInode*)' thread 7fc0a479d700 time 2022-06-08 10:02:59.409711
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc: 279: FAILED ceph_assert(!p)
>
>  ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7fc0b2cbb875]
>  2: (()+0x253a3d) [0x7fc0b2cbba3d]
>  3: (()+0x20f84e) [0x56386170d84e]
>  4: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&, CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad) [0x5638616a0f1d]
>  5: (Server::handle_client_openc(boost::intrusive_ptr<MDRequestImpl>&)+0xcf1) [0x5638616b0a91]
>  6: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xb5b) [0x5638616d78fb]
>  7: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x2f8) [0x5638616d7d78]
>  8: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x122) [0x5638616e3722]
>  9: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0x6dc) [0x5638616585ec]
>  10: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7ea) [0x56386165aa4a]
>  11: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x56386165af72]
>  12: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
>  13: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
>  14: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x1d0) [0x56386165a430]
>  15: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x56386165af72]
>  16: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
>  17: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
>  18: (MDSRank::ProgressThread::entry()+0x3d) [0x56386165a13d]
>  19: (()+0x7ea5) [0x7fc0b0b7aea5]
>  20: (clone()+0x6d) [0x7fc0af8288dd]
>
>      0> 2022-06-08 10:02:59.413 7fc0a479d700 -1 *** Caught signal (Aborted) **
>  in thread 7fc0a479d700 thread_name:mds_rank_progr
>
>  ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)
>  1: (()+0xf630) [0x7fc0b0b82630]
>  2: (gsignal()+0x37) [0x7fc0af760387]
>  3: (abort()+0x148) [0x7fc0af761a78]
>  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x7fc0b2cbb8c4]
>  5: (()+0x253a3d) [0x7fc0b2cbba3d]
>  6: (()+0x20f84e) [0x56386170d84e]
>  7: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&, CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad) [0x5638616a0f1d]
>  8: (Server::handle_client_openc(boost::intrusive_ptr<MDRequestImpl>&)+0xcf1) [0x5638616b0a91]
>  9: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xb5b) [0x5638616d78fb]
>  10: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x2f8) [0x5638616d7d78]
>  11: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x122) [0x5638616e3722]
>  12: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0x6dc) [0x5638616585ec]
>  13: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7ea) [0x56386165aa4a]
>  14: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x56386165af72]
>  15: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
>  16: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
>  17: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x1d0) [0x56386165a430]
>  18: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x56386165af72]
>  19: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
>  20: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
>  21: (MDSRank::ProgressThread::entry()+0x3d) [0x56386165a13d]
>  22: (()+0x7ea5) [0x7fc0b0b7aea5]
>  23: (clone()+0x6d) [0x7fc0af8288dd]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> <snip>
>
> Thanks
>
> -Dave
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx