Ceph RGW performance guidelines

Harry Kominos <hkominos@xxxxxxxxx> · Tue, 15 Oct 2024 13:11:29 +0200

Hello Ceph Community!

I have the following very interesting problem, for which I found no clear
guidelines upstream so I am hoping to get some input from the mailing list.
I have a 6PB cluster in operation which is currently half full. The cluster
has around 1K OSDs, and the RGW data pool has 4096 PGs (pg_num and pgp_num).

The issue is as follows:
Let's say that we have 10 million small objects (4MB each).

1) Is there a performance difference *when fetching* between storing all 10
million objects in one bucket and storing 1 million in each of 10 buckets?
There should be "some" because of the different number of PGs in use in the
two scenarios, but it is very hard to quantify.

2) What if I have 100 million objects? Is there some theoretical limit /
guideline on the number of objects that I should have in a bucket before I
see performance drops?
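
For scale, here is a rough back-of-the-envelope sketch of the numbers in the
scenarios above. It is only arithmetic, not a performance model. The function
name and its parameters are made up for illustration. It assumes one RADOS
object per 4MB S3 object, objects hashing uniformly over all 4096 PGs of the
data pool, and a bucket index shard target of 100,000 objects (the value
usually cited for rgw_max_objs_per_shard; please check your own config).

```python
import math

MB = 10**6
TB = 10**12


def bucket_layout(total_objects, buckets, object_size_bytes=4 * MB,
                  pg_num=4096, objs_per_index_shard=100_000):
    """Rough sizing for objects split evenly across a number of buckets.

    Assumptions (not taken from any measurement): uniform hashing over PGs,
    one RADOS object per S3 object, and objs_per_index_shard as the target
    number of index entries per bucket index shard.
    """
    per_bucket = total_objects // buckets
    return {
        # Raw payload size before replication or erasure coding overhead.
        "data_tb": total_objects * object_size_bytes / TB,
        # Average number of objects landing in each PG of the data pool.
        "objects_per_pg": total_objects / pg_num,
        "objects_per_bucket": per_bucket,
        # Index shards needed per bucket to stay at or below the target.
        "index_shards_per_bucket": math.ceil(per_bucket / objs_per_index_shard),
    }


for total, buckets in [(10_000_000, 1), (10_000_000, 10), (100_000_000, 1)]:
    print(total, buckets, bucket_layout(total, buckets))
```

Under these assumptions the total data and the per-PG object count depend only
on the total number of objects; what changes with the bucket count is the
number of index shards each bucket would need.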

I should mention here that the contents of the bucket *never need to be
listed*. The user always knows the object's name and fetches it directly
with curl.

Thank you for your help,
Harry

P.S.
The following URLs have been very informative, but they do not answer my
question unfortunately.

https://www.redhat.com/en/blog/red-hat-ceph-object-store-dell-emc-servers-part-1
https://www.redhat.com/en/blog/scaling-ceph-billion-objects-and-beyond