On Sat, Feb 10, 2024 at 10:05:02AM -0500, Vladimir Sigunov wrote:
> Hello Community!
> I would appreciate any help/suggestions with the massive RGWs outage we are
> facing.
> The cluster's overall status is acceptable (HEALTH_WARN because of some pgs
> not scrubbed in time), and the cluster is operational.
> However, all RGWs fail to start with a core dump.
> The only issue I see at the moment is the RGW GC queue (radosgw-admin gc
> list) that contains 600K records.
> I believe this could be the root cause of the issue. When I pause OSD iops
> (ceph osd pause), all RGWs start with no issues.
> There are no large OMAPs or any other warnings in ceph -s output.

To get you going for the moment, how about disabling the GC threads in
the RGW daemon, and then processing GC asynchronously?

Add "rgw_enable_gc_threads=0" to ceph.conf.

After that, to find out why you get the core dump, start up a separate
RGW instance with debug logging enabled.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robbat2@xxxxxxxxxx
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
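
For reference, the suggestion above might look roughly like this in ceph.conf. This is only a sketch: the section names (client.rgw.gw1, client.rgw.debug) are hypothetical placeholders, and the debug levels are a common choice rather than anything specified in this thread.

```ini
# Hypothetical section name; use the section your production RGW instances read.
[client.rgw.gw1]
# Keep the RGW daemon from running garbage collection in-process.
rgw_enable_gc_threads = 0

# Hypothetical separate instance used only to reproduce the crash.
# GC threads are left at their default here so the failure can be observed.
[client.rgw.debug]
# Verbose RGW and messenger logging (assumed levels, adjust as needed).
debug_rgw = 20
debug_ms = 1
```

With the GC threads disabled, the queue would presumably have to be drained by hand, for example with "radosgw-admin gc process" (optionally with --include-all), while "radosgw-admin gc list --include-all" shows what is still pending.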