Re: Crashing MDS

Hi Dave,

Just to make sure, have you checked whether the host still has free inodes available?
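
In case it helps, here is a minimal sketch of that check in Python, roughly what `df -i` reports. It only looks at the local filesystems on the MDS host, not at CephFS itself. `/var/log/ceph` comes from your message; `/var/lib/ceph` is an assumption about where the daemon data lives, so adjust the paths to your deployment.

```python
import os


def inode_usage(path="/"):
    """Return (total, free, percent_used) inode figures for the
    filesystem that holds `path`, using os.statvfs."""
    st = os.statvfs(path)
    total = st.f_files    # total inodes on the filesystem
    free = st.f_favail    # inodes available to unprivileged users
    # Some filesystems report 0 total inodes; avoid dividing by zero.
    used_pct = 0.0 if total == 0 else 100.0 * (total - free) / total
    return total, free, used_pct


if __name__ == "__main__":
    # Paths are illustrative; /var/lib/ceph is an assumed location.
    for p in ("/", "/var/log/ceph", "/var/lib/ceph"):
        if os.path.exists(p):
            total, free, pct = inode_usage(p)
            print(f"{p}: {free} of {total} inodes free ({pct:.1f}% used)")
```

A filesystem showing close to 100% inodes used would be worth looking at before digging further into the backtrace.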

On Wed, 8 Jun 2022 at 19:22, Dave Schulz <dschulz@xxxxxxxxxxx> wrote:

> Hi Everyone,
>
> I have an MDS server that's crashing moments after it starts. The
> filesystem is set to max_mds=5 and mds.[1-4] are all up and active but
> mds.0 keeps crashing. All I can see is the following in the
> /var/log/ceph/ceph-mds..... logfile. Any thoughts?
>
>
>      -2> 2022-06-08 10:02:59.408 7fc0a479d700  4 mds.0.server
> handle_client_request client_request(client.162790796:2313708455 create
> #0x500012ba51c/0192.jpg 2022-06-08 09:06:03.780237 RETRY=19
> caller_uid=10363898, caller_gid=10363898{10363898,}) v4
>      -1> 2022-06-08 10:02:59.410 7fc0a479d700 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc:
>
> In function 'void MDCache::add_inode(CInode*)' thread 7fc0a479d700 time
> 2022-06-08 10:02:59.409711
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc:
>
> 279: FAILED ceph_assert(!p)
>
>   ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0)
> nautilus (stable)
>   1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x14a) [0x7fc0b2cbb875]
>   2: (()+0x253a3d) [0x7fc0b2cbba3d]
>   3: (()+0x20f84e) [0x56386170d84e]
>   4: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&,
> CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad) [0x5638616a0f1d]
>   5:
> (Server::handle_client_openc(boost::intrusive_ptr<MDRequestImpl>&)+0xcf1)
> [0x5638616b0a91]
>   6:
> (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xb5b)
>
> [0x5638616d78fb]
>   7: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest
> const> const&)+0x2f8) [0x5638616d7d78]
>   8: (Server::dispatch(boost::intrusive_ptr<Message const>
> const&)+0x122) [0x5638616e3722]
>   9: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message
> const> const&)+0x6dc) [0x5638616585ec]
>   10: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&,
> bool)+0x7ea) [0x56386165aa4a]
>   11: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const>
> const&)+0x12) [0x56386165af72]
>   12: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
>   13: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
>   14: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&,
> bool)+0x1d0) [0x56386165a430]
>   15: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const>
> const&)+0x12) [0x56386165af72]
>   16: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
>   17: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
>   18: (MDSRank::ProgressThread::entry()+0x3d) [0x56386165a13d]
>   19: (()+0x7ea5) [0x7fc0b0b7aea5]
>   20: (clone()+0x6d) [0x7fc0af8288dd]
>
>       0> 2022-06-08 10:02:59.413 7fc0a479d700 -1 *** Caught signal
> (Aborted) **
>   in thread 7fc0a479d700 thread_name:mds_rank_progr
>
>   ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0)
> nautilus (stable)
>   1: (()+0xf630) [0x7fc0b0b82630]
>   2: (gsignal()+0x37) [0x7fc0af760387]
>   3: (abort()+0x148) [0x7fc0af761a78]
>   4: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x199) [0x7fc0b2cbb8c4]
>   5: (()+0x253a3d) [0x7fc0b2cbba3d]
>   6: (()+0x20f84e) [0x56386170d84e]
>   7: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&,
> CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad) [0x5638616a0f1d]
>   8:
> (Server::handle_client_openc(boost::intrusive_ptr<MDRequestImpl>&)+0xcf1)
> [0x5638616b0a91]
>   9:
> (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xb5b)
>
> [0x5638616d78fb]
>   10: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest
> const> const&)+0x2f8) [0x5638616d7d78]
>   11: (Server::dispatch(boost::intrusive_ptr<Message const>
> const&)+0x122) [0x5638616e3722]
>   12: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message
> const> const&)+0x6dc) [0x5638616585ec]
>   13: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&,
> bool)+0x7ea) [0x56386165aa4a]
>   14: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const>
> const&)+0x12) [0x56386165af72]
>   15: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
>   16: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
>   17: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&,
> bool)+0x1d0) [0x56386165a430]
>   18: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const>
> const&)+0x12) [0x56386165af72]
>   19: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
>   20: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
>   21: (MDSRank::ProgressThread::entry()+0x3d) [0x56386165a13d]
>   22: (()+0x7ea5) [0x7fc0b0b7aea5]
>   23: (clone()+0x6d) [0x7fc0af8288dd]
>   NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
> <snip>
>
>
> Thanks
>
> -Dave
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>