Thank you for your response and for raising an important question about potential bottlenecks in RGW versus the overall Ceph cluster. I appreciate your insight and would like to provide more detail about the issues I have been experiencing.

In my deployment, RGW instances 17-20 have been hanging or returning errors such as "failed to read header: The socket was closed due to a timeout" and "res_query() failed". These issues have caused disruption and congestion within the cluster.

The index pool is indeed placed on a large number of NVMe SSDs to ensure fast access and efficient indexing of data. The number of Placement Groups (PGs) allocated to the index pool is also configured to be sufficient for the workload.

On Tue, Jun 6, 2023 at 21:27 Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:

> Do you have reason to believe that your bottlenecks are within RGW, not
> within the cluster?
>
> e.g. is your index pool on a large number of NVMe SSDs with sufficient
> PGs? Is your bucket data on SSD as well?
>
> On Jun 6, 2023, at 13:52, Ramin Najjarbashi <ramin.najarbashi@xxxxxxxxx>
> wrote:
>
> I would like to seek your insights and recommendations regarding the
> practice of workload separation in a Ceph RGW (RADOS Gateway) cluster. I
> have been facing challenges with large queues in my deployment and would
> appreciate your expertise in determining whether workload separation is a
> recommended approach or not.
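
P.S. For completeness, this is roughly how the index pool placement and PG count can be verified. This is a minimal sketch: "default.rgw.buckets.index" is the default index pool name for the default zone, and <rule-name> is a placeholder, so substitute your zone's actual pool and CRUSH rule names.

    # list pools and spot the RGW index pool
    ceph osd pool ls detail | grep index

    # check the PG count and the CRUSH rule of the index pool
    ceph osd pool get default.rgw.buckets.index pg_num
    ceph osd pool get default.rgw.buckets.index crush_rule

    # dump the rule and confirm it restricts placement to the nvme device class
    ceph osd crush rule dump <rule-name>

If the rule does not pin the pool to NVMe, a device-class rule can be created with "ceph osd crush rule create-replicated <name> <root> <failure-domain> nvme" and applied with "ceph osd pool set <pool> crush_rule <name>".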