Re: Ceph RGW performance guidelines

> Hello Ceph Community!
> 
> I have the following very interesting problem, for which I found no clear
> guidelines upstream, so I am hoping to get some input from the mailing list.
> I have a 6PB cluster in operation which is currently half full. The cluster
> has around 1K OSDs, and the RGW data pool has 4096 PGs (and matching pgp_num).

Even without specifics I can tell you that pg_num is way too low. With ~1K OSDs and only 4096 PGs on the data pool, you're averaging on the order of a dozen PG replicas per OSD (at 3x replication), far below the usual target of 100-200 PGs per OSD.
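
For reference, here's how you'd check and raise it. I'm assuming the stock pool name `default.rgw.buckets.data`, so substitute your actual data pool; 16384 is just an illustration, sized roughly as (OSD count x 100) x the pool's share of cluster data / replica count, rounded to a power of two.

`ceph osd pool get default.rgw.buckets.data pg_num`
`ceph osd pool set default.rgw.buckets.data pg_num 16384`

On Nautilus and later, pgp_num follows pg_num automatically, and you can instead let the autoscaler do the math with `ceph osd pool set default.rgw.buckets.data pg_autoscale_mode on`.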

Please send

`ceph -s`
`ceph osd tree | head -30`
`ceph osd df | head -10`
`ceph -v`

Also, tell us what media your index and bucket OSDs are on.

> The issue is as follows:
> Let's say that we have 10 million small objects, 4MB each.

In RGW terms, those are large objects.  Small objects would be 4KB.

> 1)Is there a performance difference *when fetching* between storing all 10
> million objects in one bucket and storing 1 million in 10 buckets?

Larger buckets will generally be slower for some operations, but if you're on Reef and your bucket wasn't created on an older release, 10 million objects shouldn't be too bad. Listing, though, gets progressively slower as a bucket grows.
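
To see where an existing bucket stands, `radosgw-admin` will report the object count and current shard count (the bucket name here is a placeholder):

`radosgw-admin bucket stats --bucket=mybucket`
`radosgw-admin bucket limit check`

The second command flags buckets whose shards are over the fill threshold.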

> There
> should be "some" because of the different number of pgs in use, in the 2
> scenarios but it is very hard to quantify.
> 
> 2) What if I have 100 million objects? Is there some theoretical limit /
> guideline on the number of objects that I should have in a bucket before I
> see performance drops?

At that point, you might consider indexless buckets, if your client/application can keep track of objects in its own DB.

With dynamic sharding (assuming you have it enabled), RGW defaults to 100,000 objects per shard and 1999 max shards, so I *think* that past roughly 200M objects (1999 x 100,000) a bucket won't auto-reshard any further.
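
You can verify what your cluster is actually running with (option names as in recent releases; on older ones, grep your ceph.conf):

`ceph config get client.rgw rgw_dynamic_resharding`
`ceph config get client.rgw rgw_max_objs_per_shard`
`ceph config get client.rgw rgw_max_dynamic_shards`

Past that ceiling you can still reshard manually, e.g. `radosgw-admin bucket reshard --bucket=mybucket --num-shards=4001` (bucket name and shard count illustrative; a prime shard count is commonly recommended).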

> I should mention here that the contents of the bucket *never need to be
> listed*. The user always knows how to do a curl to get the contents.

We can most likely improve your config, but you may also be a candidate for an indexless bucket. They don't get a lot of press, and I won't claim to be an expert in them, but it's something to look into.
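
Roughly, indexless is a property of a placement target, so the setup is a sketch like this (using the default zonegroup/zone and stock pool names as placeholders; adapt, commit the period if you're multisite, and restart your RGWs):

`radosgw-admin zonegroup placement add --rgw-zonegroup=default --placement-id=indexless`
`radosgw-admin zone placement add --rgw-zone=default --placement-id=indexless --data-pool=default.rgw.buckets.data --index-pool=default.rgw.buckets.index --data-extra-pool=default.rgw.buckets.non-ec --placement-index-type=indexless`

New buckets then opt in by requesting that placement at creation time. The big caveats: indexless buckets can't be listed at all, and as I understand it they don't replicate in multisite setups. Given that you never list, the first one sounds fine for your use case.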


> 
> Thank you for your help,
> Harry
> 
> P.S.
> The following URLs have been very informative, but they do not answer my
> question unfortunately.
> 
> https://www.redhat.com/en/blog/red-hat-ceph-object-store-dell-emc-servers-part-1
> https://www.redhat.com/en/blog/scaling-ceph-billion-objects-and-beyond
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx