Re: rbd-mirror keeps crashing

Hi,

Can you verify that all images are readable? Maybe an image has a corrupt journal that rbd-mirror fails to read. That's just a wild guess; I can't really interpret the stack trace. Or are there some images without journaling enabled? Are there any logs available, ideally debug logs, where you could find the responsible image? Is this the first run of rbd-mirror, or has it worked before?
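One way to check for unreadable images and images without journaling is sketched below. It is only a rough illustration and rests on assumptions: the `rbd` CLI must be available on the host, `rbd ls --format json` is assumed to print a JSON list of image names, and `rbd info --format json` is assumed to report a `features` list. The helper names `collect_infos` and `images_without_journaling` are made up for this example. It looks only at image metadata and does not validate the journal contents themselves.

```python
import json
import subprocess


def images_without_journaling(infos):
    """Return the sorted names of images whose 'features' lack 'journaling'.

    `infos` maps image name -> parsed `rbd info --format json` output
    (assumed to contain a 'features' list of strings).
    """
    return sorted(
        name
        for name, info in infos.items()
        if "journaling" not in info.get("features", [])
    )


def collect_infos(pool):
    """Hypothetical helper: gather `rbd info` for every image in `pool`.

    Returns (infos, unreadable), where `unreadable` lists the images for
    which `rbd info` exited with a non-zero status.
    """
    listing = subprocess.run(
        ["rbd", "ls", "--format", "json", pool],
        check=True, capture_output=True, text=True,
    )
    infos, unreadable = {}, []
    for name in json.loads(listing.stdout):
        result = subprocess.run(
            ["rbd", "info", "--format", "json", f"{pool}/{name}"],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            unreadable.append(name)
        else:
            infos[name] = json.loads(result.stdout)
    return infos, unreadable
```

Calling `collect_infos("mypool")` (with `mypool` standing in for the real pool name) and passing the resulting `infos` to `images_without_journaling` would give two lists to inspect: images whose metadata could not be read, and images that are not journaled.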

Regards,
Eugen

Quoting armsby@xxxxxxxxx:

Hi everyone,

I've been running rbd-mirror between my old Ceph system (16.2.10) and my new system (18.2.2). I'm using journaling mode on a pool that contains 7,500 images. Everything was running perfectly until it processed about 5,608 images. Now, it keeps crashing with the following message:

2024-07-19T05:49:32.425+0000 7f582b3fd6c0 0 set uid:gid to 167:167 (ceph:ceph)
2024-07-19T05:49:32.425+0000 7f582b3fd6c0 0 ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable), process rbd-mirror, pid 7
2024-07-19T05:49:32.429+0000 7f582b3fd6c0 1 mgrc service_daemon_register rbd-mirror.3606956688 metadata {arch=x86_64, ceph_release=pacific, ceph_version=ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable), ceph_version_short=16.2.10, container_hostname=mon-001, container_image=quay.io/ceph/ceph@sha256:2b68483bcd050472a18e73389c0e1f3f70d34bb7abf733f692e88c935ea0a6bd, cpu=Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz, distro=centos, distro_description=CentOS Stream 8, distro_version=8, hostname=mon-001, id=mon-001.lcqrti, instance_id=3606956688, kernel_description=#1 SMP Mon Jul 18 17:42:52 UTC 2022, kernel_version=4.18.0-408.el8.x86_64, mem_swap_kb=4194300, mem_total_kb=131393360, os=Linux}
2024-07-19T05:50:28.305+0000 7f5812582700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/common/Thread.cc: In function 'void Thread::create(const char*, size_t)' thread 7f5812582700 time 2024-07-19T05:50:28.303536+0000
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/common/Thread.cc: 165: FAILED ceph_assert(ret == 0)

 ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f58218b6de8]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x277002) [0x7f58218b7002]
 3: /usr/lib64/ceph/libceph-common.so.2(+0x362fd7) [0x7f58219a2fd7]
 4: (CommonSafeTimer<std::mutex>::init()+0x1fe) [0x7f58219a963e]
 5: (journal::Journaler::Threads::Threads(ceph::common::CephContext*)+0x2fc) [0x55c9b33c6ddc]
 6: (journal::Journaler::Journaler(librados::v14_2_0::IoCtx&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, journal::Settings const&, journal::CacheManagerHandler*)+0x50) [0x55c9b33c6f10]
 7: (librbd::Journal<librbd::ImageCtx>::get_tag_owner(librados::v14_2_0::IoCtx&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, librbd::asio::ContextWQ*, Context*)+0x19f) [0x55c9b2fa65af]
 8: (librbd::mirror::GetInfoRequest<librbd::ImageCtx>::get_journal_tag_owner()+0x210) [0x55c9b31869f0]
 9: (librbd::mirror::GetInfoRequest<librbd::ImageCtx>::handle_get_mirror_image(int)+0x8c8) [0x55c9b3189d78]
 10: /lib64/librados.so.2(+0xa8546) [0x7f582aedb546]
 11: /lib64/librados.so.2(+0xc17e5) [0x7f582aef47e5]
 12: /lib64/librados.so.2(+0xc3742) [0x7f582aef6742]
 13: /lib64/librados.so.2(+0xc914a) [0x7f582aefc14a]
 14: /lib64/libstdc++.so.6(+0xc2ba3) [0x7f581fb03ba3]
 15: /lib64/libpthread.so.0(+0x81ca) [0x7f5820cec1ca]
 16: clone()


Has anyone encountered a similar problem or have any insight into what might be causing this crash?

Thanks in advance for your help.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

