Hello,
We have a CephFS file system with two active MDS daemons. Currently rank 1 is
repeatedly crashing with FAILED ceph_assert(p->first <= start) in the
md_log_replay thread. Is there any way to work around this and get back to an
accessible file system, or should we start with disaster recovery?
It seems similar to https://tracker.ceph.com/issues/61009
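Before attempting anything destructive, our plan would be to back up and
inspect the rank 1 journal first, roughly as sketched below (assuming the
standard cephfs-journal-tool workflow from the disaster-recovery docs;
"cephfs" stands in for our actual file system name):

    # take a backup of the rank 1 journal before any recovery attempt
    cephfs-journal-tool --rank=cephfs:1 journal export backup.rank1.bin

    # check the journal's integrity and locate any damaged region
    cephfs-journal-tool --rank=cephfs:1 journal inspect

The docs then describe recover_dentries plus a journal reset as the heavier
option, but we would rather avoid that if a workaround exists.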
Crash info:
{
    "assert_condition": "p->first <= start",
    "assert_file": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el9/BUILD/ceph-18.2.2/src/include/interval_set.h",
    "assert_func": "void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]",
    "assert_line": 568,
    "assert_msg": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el9/BUILD/ceph-18.2.2/src/include/interval_set.h: In function 'void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]' thread 7fcdaaf8a640 time 2024-05-08T00:26:22.049974+0200\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el9/BUILD/ceph-18.2.2/src/include/interval_set.h: 568: FAILED ceph_assert(p->first <= start)\n",
    "assert_thread_name": "md_log_replay",
    "backtrace": [
        "/lib64/libc.so.6(+0x54db0) [0x7fcdb7a54db0]",
        "/lib64/libc.so.6(+0xa154c) [0x7fcdb7aa154c]",
        "raise()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x188) [0x7fcdb83610ff]",
        "/usr/lib64/ceph/libceph-common.so.2(+0x161263) [0x7fcdb8361263]",
        "/usr/bin/ceph-mds(+0x1f3b0e) [0x55a5904a9b0e]",
        "/usr/bin/ceph-mds(+0x1f3b55) [0x55a5904a9b55]",
        "(EMetaBlob::replay(MDSRank*, LogSegment*, int, MDPeerUpdate*)+0x4b9d) [0x55a5906e1c8d]",
        "(EUpdate::replay(MDSRank*)+0x5d) [0x55a5906eacbd]",
        "(MDLog::_replay_thread()+0x7a1) [0x55a590694af1]",
        "/usr/bin/ceph-mds(+0x1460f1) [0x55a5903fc0f1]",
        "/lib64/libc.so.6(+0x9f802) [0x7fcdb7a9f802]",
        "/lib64/libc.so.6(+0x3f450) [0x7fcdb7a3f450]"
    ],
    "ceph_version": "18.2.2",
    "crash_id": "2024-05-07T22:26:22.050652Z_8be89ffb-bb87-4832-9339-57f8bd29f766",
    "entity_name": "mds.spod19",
    "os_id": "almalinux",
    "os_name": "AlmaLinux",
    "os_version": "9.3 (Shamrock Pampas Cat)",
    "os_version_id": "9.3",
    "process_name": "ceph-mds",
    "stack_sig": "3d0a2ca9b3c7678bf69efc20fff42b588c63f8be1832e1e0c28c99bafc082c15",
    "timestamp": "2024-05-07T22:26:22.050652Z",
    "utsname_hostname": "spod19.ijs.si",
    "utsname_machine": "x86_64",
    "utsname_release": "5.14.0-362.8.1.el9_3.x86_64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PREEMPT_DYNAMIC Tue Nov 7 14:54:22 EST 2023"
}
Cheers,
Dejan