Hi,
can you verify that all images are readable? Maybe there's a corrupt
journal for one of the images and rbd-mirror fails to read it; just a
wild guess, I can't really interpret the stack trace. Or are there
some images without journaling enabled? Are there any logs available,
maybe even debug logs, where you could identify the responsible
image? Is this the first run of rbd-mirror, or has it worked before?
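If it helps, something along these lines could narrow it down. This is
an untested sketch, "mypool" is just a placeholder for your pool name,
and "rbd journal inspect" can take a while on large journals:

for img in $(rbd ls mypool); do
    # flag images that don't have the journaling feature enabled
    rbd info "mypool/$img" | grep -q journaling || echo "no journaling: $img"
    # inspect the image journal for structural errors
    rbd journal inspect --pool mypool --image "$img" >/dev/null 2>&1 || echo "journal issue: $img"
done

For more verbose logging you could also raise debug_rbd_mirror on the
cluster running the daemon (e.g. "ceph config set client
debug_rbd_mirror 15") and check which image the daemon was working on
right before the crash.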
Regards,
Eugen
Quoting armsby@xxxxxxxxx:
Hi everyone,
I've been running rbd-mirror between my old Ceph system (16.2.10)
and my new system (18.2.2). I'm using journaling mode on a pool that
contains 7,500 images. Everything was running perfectly until it
processed about 5,608 images. Now, it keeps crashing with the
following message:
2024-07-19T05:49:32.425+0000 7f582b3fd6c0 0 set uid:gid to 167:167 (ceph:ceph)
2024-07-19T05:49:32.425+0000 7f582b3fd6c0 0 ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable), process rbd-mirror, pid 7
2024-07-19T05:49:32.429+0000 7f582b3fd6c0 1 mgrc service_daemon_register rbd-mirror.3606956688 metadata {arch=x86_64, ceph_release=pacific, ceph_version=ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable), ceph_version_short=16.2.10, container_hostname=mon-001, container_image=quay.io/ceph/ceph@sha256:2b68483bcd050472a18e73389c0e1f3f70d34bb7abf733f692e88c935ea0a6bd, cpu=Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz, distro=centos, distro_description=CentOS Stream 8, distro_version=8, hostname=mon-001, id=mon-001.lcqrti, instance_id=3606956688, kernel_description=#1 SMP Mon Jul 18 17:42:52 UTC 2022, kernel_version=4.18.0-408.el8.x86_64, mem_swap_kb=4194300, mem_total_kb=131393360, os=Linux}
2024-07-19T05:50:28.305+0000 7f5812582700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/common/Thread.cc: In function 'void Thread::create(const char*, size_t)' thread 7f5812582700 time 2024-07-19T05:50:28.303536+0000
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/common/Thread.cc: 165: FAILED ceph_assert(ret == 0)
ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f58218b6de8]
2: /usr/lib64/ceph/libceph-common.so.2(+0x277002) [0x7f58218b7002]
3: /usr/lib64/ceph/libceph-common.so.2(+0x362fd7) [0x7f58219a2fd7]
4: (CommonSafeTimer<std::mutex>::init()+0x1fe) [0x7f58219a963e]
5: (journal::Journaler::Threads::Threads(ceph::common::CephContext*)+0x2fc) [0x55c9b33c6ddc]
6: (journal::Journaler::Journaler(librados::v14_2_0::IoCtx&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, journal::Settings const&, journal::CacheManagerHandler*)+0x50) [0x55c9b33c6f10]
7: (librbd::Journal<librbd::ImageCtx>::get_tag_owner(librados::v14_2_0::IoCtx&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, librbd::asio::ContextWQ*, Context*)+0x19f) [0x55c9b2fa65af]
8: (librbd::mirror::GetInfoRequest<librbd::ImageCtx>::get_journal_tag_owner()+0x210) [0x55c9b31869f0]
9: (librbd::mirror::GetInfoRequest<librbd::ImageCtx>::handle_get_mirror_image(int)+0x8c8) [0x55c9b3189d78]
10: /lib64/librados.so.2(+0xa8546) [0x7f582aedb546]
11: /lib64/librados.so.2(+0xc17e5) [0x7f582aef47e5]
12: /lib64/librados.so.2(+0xc3742) [0x7f582aef6742]
13: /lib64/librados.so.2(+0xc914a) [0x7f582aefc14a]
14: /lib64/libstdc++.so.6(+0xc2ba3) [0x7f581fb03ba3]
15: /lib64/libpthread.so.0(+0x81ca) [0x7f5820cec1ca]
16: clone()
Has anyone encountered a similar problem, or does anyone have insight
into what might be causing this crash?
Thanks in advance for your help.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx