Re: MDS crashes after evicting client session

A patch for this has already been merged and backported to Quincy as well. It
will be included in the next Quincy release.
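
For anyone triaging similar reports: the output of "ceph crash info" is plain
JSON, so the identifying fields can be pulled out with a few lines of Python.
This is just a sketch; the report below is abbreviated to the fields quoted in
this thread, not the full dump.

```python
import json

# Abbreviated crash report as printed by `ceph crash info <crash_id>`
# (values taken from the report in this thread; many fields omitted).
report = json.loads("""
{
    "assert_condition": "!mds->is_any_replay()",
    "assert_func": "void MDLog::_submit_entry(LogEvent*, MDSLogContextBase*)",
    "assert_line": 283,
    "ceph_version": "17.2.3",
    "entity_name": "mds.ksz-cephfs2.ceph03.vsyrbk",
    "process_name": "ceph-mds",
    "timestamp": "2022-09-22T11:26:24.013274Z"
}
""")

# Summarize which daemon hit which assertion on which version --
# enough to match the crash against a known tracker issue.
summary = "{entity_name} ({process_name} {ceph_version}) failed {assert_condition} in {assert_func}".format(**report)
print(summary)
```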

On Thu, Sep 22, 2022 at 5:12 PM E Taka <0etaka0@xxxxxxxxx> wrote:

> Ceph 17.2.3 (dockerized in Ubuntu 20.04)
>
> As the subject says, the MDS process always crashes after evicting a
> client. "ceph -w" shows:
>
> 2022-09-22T13:26:23.305527+0200 mds.ksz-cephfs2.ceph00.kqjdwe [INF]
> Evicting (and blocklisting) client session 5181680 (
> 10.149.12.21:0/3369570791)
> 2022-09-22T13:26:35.729317+0200 mon.ceph00 [INF] daemon
> mds.ksz-cephfs2.ceph03.vsyrbk restarted
> 2022-09-22T13:26:36.039678+0200 mon.ceph00 [INF] daemon
> mds.ksz-cephfs2.ceph01.xybiqv restarted
> 2022-09-22T13:29:21.000392+0200 mds.ksz-cephfs2.ceph04.ekmqio [INF]
> Evicting (and blocklisting) client session 5249349 (
> 10.149.12.22:0/2459302619)
> 2022-09-22T13:29:32.069656+0200 mon.ceph00 [INF] daemon
> mds.ksz-cephfs2.ceph01.xybiqv restarted
> 2022-09-22T13:30:00.000101+0200 mon.ceph00 [INF] overall HEALTH_OK
> 2022-09-22T13:30:20.710271+0200 mon.ceph00 [WRN] Health check failed: 1
> daemons have recently crashed (RECENT_CRASH)
>
> The crash info of the crashed MDS is:
> # ceph crash info
> 2022-09-22T11:26:24.013274Z_b005f3fc-7704-4cfc-96c5-f2a9c993f166
> {
>    "assert_condition": "!mds->is_any_replay()",
>    "assert_file":
>
> "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.3/rpm/el8/BUILD/ceph-17.2.3/src/mds/MDLog.cc",
>
>    "assert_func": "void MDLog::_submit_entry(LogEvent*,
> MDSLogContextBase*)",
>    "assert_line": 283,
>    "assert_msg":
>
> "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.3/rpm/el8/BUILD/ceph-17.2.3/src/mds/MDLog.cc:
> In function 'void MDLog::_submit_entry(LogEvent*, MDSLogContextBase*)'
> thread 7f76fa8f6700 time
>
> 2022-09-22T11:26:23.992050+0000\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.3/rpm/el8/BUILD/ceph-17.2.3/src/mds/MDLog.cc:
> 283: FAILED ceph_assert(!mds->is_any_replay())\n",
>    "assert_thread_name": "ms_dispatch",
>    "backtrace": [
>        "/lib64/libpthread.so.0(+0x12ce0) [0x7f770231bce0]",
>        "gsignal()",
>        "abort()",
>        "(ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x1b0) [0x7f770333bcd2]",
>        "/usr/lib64/ceph/libceph-common.so.2(+0x283e95) [0x7f770333be95]",
>        "(MDLog::_submit_entry(LogEvent*, MDSLogContextBase*)+0x3f)
> [0x55991905efdf]",
>        "(Server::journal_close_session(Session*, int, Context*)+0x78c)
> [0x559918d7d63c]",
>        "(Server::kill_session(Session*, Context*)+0x212) [0x559918d7dd92]",
>        "(Server::apply_blocklist()+0x10d) [0x559918d7e04d]",
>        "(MDSRank::apply_blocklist(std::set<entity_addr_t,
> std::less<entity_addr_t>, std::allocator<entity_addr_t> > const&, unsigned
> int)+0x34) [0x559918d39d74]",
>        "(MDSRankDispatcher::handle_osd_map()+0xf6) [0x559918d3a0b6]",
>        "(MDSDaemon::handle_core_message(boost::intrusive_ptr<Message const>
> const&)+0x39b) [0x559918d2330b]",
>        "(MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message>
> const&)+0xc3) [0x559918d23cc3]",
>        "(DispatchQueue::entry()+0x14fa) [0x7f77035c240a]",
>        "(DispatchQueue::DispatchThread::entry()+0x11) [0x7f7703679481]",
>        "/lib64/libpthread.so.0(+0x81ca) [0x7f77023111ca]",
>        "clone()"
>    ],
>    "ceph_version": "17.2.3",
>    "crash_id":
> "2022-09-22T11:26:24.013274Z_b005f3fc-7704-4cfc-96c5-f2a9c993f166",
>    "entity_name": "mds.ksz-cephfs2.ceph03.vsyrbk",
>    "os_id": "centos",
>    "os_name": "CentOS Stream",
>    "os_version": "8",
>    "os_version_id": "8",
>    "process_name": "ceph-mds",
>    "stack_sig":
> "b75e46941b5f6b7c05a037f9af5d42bb19d82ab7fc6a3c168533fc31a42b4de8",
>    "timestamp": "2022-09-22T11:26:24.013274Z",
>    "utsname_hostname": "ceph03",
>    "utsname_machine": "x86_64",
>    "utsname_release": "5.4.0-125-generic",
>    "utsname_sysname": "Linux",
>    "utsname_version": "#141-Ubuntu SMP Wed Aug 10 13:42:03 UTC 2022"
> }
>
> (Don't be confused by the timestamps: "ceph -w" reports in UTC+2, while
> "crash info" reports in UTC.)
>
> Should I report this as a bug, or did I miss something that caused the error?
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>

-- 
*Dhairya Parmar*

He/Him/His

Associate Software Engineer, CephFS

Red Hat Inc. <https://www.redhat.com/>

dparmar@xxxxxxxxxx