Thanks Enrico,

We are only syncing metadata between sites, so I don't think that bug is the cause of our issues. I have been able to delete ~30k objects without causing the RGW to stop processing.

Thanks
Iain

________________________________
From: Enrico Bocchi <enrico.bocchi@xxxxxxx>
Sent: 22 May 2024 13:48
To: Iain Stott <Iain.Stott@xxxxxxx>; ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re: Reef RGWs stop processing requests

Hi Iain,

Can you check if it relates to this? -- https://tracker.ceph.com/issues/63373
There is a bug when bulk deleting objects, causing the RGWs to deadlock.

Cheers,
Enrico

On 5/17/24 11:24, Iain Stott wrote:
> Hi,
>
> We are running 3 clusters in multisite. All 3 were running Quincy 17.2.6 and using cephadm. We upgraded one of the secondary sites to Reef 18.2.1 a couple of weeks ago and were planning on doing the rest shortly afterwards.
>
> We run 3 RGW daemons on separate physical hosts behind an external HAProxy HA pair for each cluster.
>
> Since we upgraded to Reef we have had issues with the RGWs stopping processing requests. We can see that they don't crash, as they still have entries in the logs about syncing, but as far as request processing goes, they just stop. While debugging this we have 1 of the 3 RGWs running a Quincy image, and this one has never stopped processing requests. Every Reef container we have deployed has stopped within 48 hours of being deployed. We have tried Reef versions 18.2.1, 18.2.2 and 18.1.3 and all exhibit the same issue. We are running podman 4.6.1 on CentOS 8 with kernel 4.18.0-513.24.1.el8_9.x86_64.
>
> We have enabled debug logs for the RGWs but we have been unable to find anything in them that would shed light on the cause.
>
> We are just wondering if anyone has any ideas on what could be causing this or how to debug it further?
>
> Thanks
> Iain
>
> Iain Stott
> OpenStack Engineer
> Iain.Stott@xxxxxxx
> www.thg.com
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Enrico Bocchi
CERN European Laboratory for Particle Physics
IT - Storage & Data Management - General Storage Services
Mailbox: G20500 - Office: 31-2-010
1211 Genève 23
Switzerland