Re: Reef RGWs stop processing requests

Thanks Enrico,

We are only syncing metadata between sites, so I don't think that bug will be the cause of our issues.

I have been able to delete ~30k objects without causing the RGW to stop processing.
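For anyone wanting to repeat a similar bulk-delete test, here is a minimal sketch. It assumes that "bulk delete" means the S3 multi-object delete call (boto3's `delete_objects`, which accepts at most 1000 keys per request); the helper names `batches` and `bulk_delete` are made up for illustration, and this is not necessarily how the ~30k objects above were deleted.

```python
from typing import Iterable, Iterator, List


def batches(keys: Iterable[str], size: int = 1000) -> Iterator[List[str]]:
    """Yield lists of at most `size` keys (multi-object delete caps at 1000)."""
    batch: List[str] = []
    for key in keys:
        batch.append(key)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_delete(client, bucket: str, keys: Iterable[str], size: int = 1000) -> int:
    """Submit keys for deletion in batches; return the number of keys submitted.

    `client` is assumed to be a boto3 S3 client pointed at an RGW endpoint.
    """
    submitted = 0
    for batch in batches(keys, size):
        client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": k} for k in batch], "Quiet": True},
        )
        submitted += len(batch)
    return submitted
```

With a client created along the lines of `boto3.client("s3", endpoint_url=...)` (the endpoint and bucket name are site-specific and not given in this thread), calling `bulk_delete(client, "test-bucket", keys)` would issue one multi-object delete request per 1000 keys.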


Thanks
Iain
________________________________
From: Enrico Bocchi <enrico.bocchi@xxxxxxx>
Sent: 22 May 2024 13:48
To: Iain Stott <Iain.Stott@xxxxxxx>; ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re:  Reef RGWs stop processing requests

Hi Iain,

Can you check if it relates to this? --
https://tracker.ceph.com/issues/63373
There is a bug when bulk deleting objects, causing the RGWs to deadlock.

Cheers,
Enrico


On 5/17/24 11:24, Iain Stott wrote:
> Hi,
>
> We are running 3 clusters in multisite. All 3 were running Quincy 17.2.6 and using cephadm. We upgraded one of the secondary sites to Reef 18.2.1 a couple of weeks ago and were planning on doing the rest shortly afterwards.
>
> We run 3 RGW daemons on separate physical hosts behind an external HAProxy HA pair for each cluster.
>
> Since we upgraded to Reef we have had issues with the RGWs stopping processing requests. We can see that they don't crash, as they still write log entries about syncing, but as far as request processing goes, they just stop. While debugging this we have 1 of the 3 RGWs running a Quincy image, and it has never stopped processing requests. Every Reef container we deploy has stopped within 48 hours of being deployed. We have tried Reef versions 18.2.1, 18.2.2 and 18.1.3 and all exhibit the same issue. We are running podman 4.6.1 on CentOS 8 with kernel 4.18.0-513.24.1.el8_9.x86_64.
>
> We have enabled debug logs for the RGWs but we have been unable to find anything in them that would shed light on the cause.
>
> We are just wondering if anyone had any ideas on what could be causing this or how to debug it further?
>
> Thanks
> Iain
>
> Iain Stott
> OpenStack Engineer
> Iain.Stott@xxxxxxx
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
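
Since the daemons described above stay alive but stop answering requests, one way to catch the moment it happens is to probe each RGW directly, bypassing HAProxy, with a short timeout. The following is only a sketch under assumptions: the function names are invented for illustration, the URLs are placeholders, and any HTTP response at all (even an error status) is treated as "still processing requests".

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def probe(url: str, timeout: float = 5.0,
          opener: Callable = urllib.request.urlopen) -> str:
    """Return 'ok (...)' if the endpoint answers, 'unresponsive (...)' otherwise."""
    try:
        with opener(url, timeout=timeout) as resp:
            return f"ok ({resp.status})"
    except urllib.error.HTTPError as exc:
        # An HTTP error status still means the daemon answered the request.
        return f"ok ({exc.code})"
    except (urllib.error.URLError, OSError) as exc:
        # Timeouts and connection failures are what a stalled daemon looks like.
        return f"unresponsive ({exc})"


def probe_all(urls: Iterable[str], timeout: float = 5.0,
              opener: Callable = urllib.request.urlopen) -> Dict[str, str]:
    """Probe every RGW endpoint individually and map URL to result."""
    return {url: probe(url, timeout, opener) for url in urls}
```

Run periodically against each RGW host (for example `probe_all(["http://rgw1.example:8080/"])`, where the host and port are hypothetical), this would give a timestamp for when a given daemon stopped responding, which can then be lined up with its debug logs.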

--
Enrico Bocchi
CERN European Laboratory for Particle Physics
IT - Storage & Data Management - General Storage Services
Mailbox: G20500 - Office: 31-2-010
1211 Genève 23
Switzerland



