On 09 Sep 2014, at 21:43, Gregory Farnum <greg at inktank.com> wrote:

> Yehuda can talk about this with more expertise than I can, but I think
> it should be basically fine. By creating so many buckets you're
> decreasing the effectiveness of RGW's metadata caching, which means
> the initial lookup in a particular bucket might take longer.

Thanks for your thoughts. By "initial lookup in a particular bucket", do you mean accessing any of the objects in a bucket? If we access an object directly (without enumerating the bucket's contents), would that still be an issue? Just trying to understand the inner workings a bit better so I can make more educated guesses :)

> The big concern is that we do maintain a per-user list of all their
> buckets, which is stored in a single RADOS object, so if you have an
> extreme number of buckets that RADOS object could get pretty big and
> become a bottleneck when creating/removing/listing the buckets.

Alright. Listing buckets is no problem; that we don't do. Can you say what "pretty big" would be in terms of MB? How much space does a single bucket record consume in that object? Based on that I could run a few numbers.

> You should run your own experiments to figure out what the limits are
> there; perhaps you have an easy way of sharding up documents into
> different users.

Good advice. We can do that per distributor (an org unit in our software), which would at least compartmentalize any potential locking issues in this area to that single entity. Still, there would be quite a lot of buckets/objects per distributor, so some more detail on the items above would be great.

Thanks a lot!

Daniel
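P.S. For anyone following along, the "sharding up documents into different users" idea could look something like the sketch below: deterministically map each distributor (or each document owner) to one of a fixed pool of RGW users, so no single user's bucket-list object grows without bound. All names here (`rgw_user_for_distributor`, the `docstore-user-NN` naming scheme, the shard count of 64) are hypothetical illustrations, not anything from RGW itself.

```python
import hashlib

def rgw_user_for_distributor(distributor_id: str, shard_count: int = 64) -> str:
    """Map a distributor to one of `shard_count` RGW users.

    Deterministic: the same distributor always lands on the same user,
    so its buckets stay grouped, while the per-user bucket list stays
    bounded. Names and shard count are illustrative assumptions.
    """
    digest = hashlib.md5(distributor_id.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shard_count
    return "docstore-user-%02d" % shard
```

Each distributor's buckets would then be created under the user this function returns, and the shard count can be sized once rough numbers for the per-bucket record size are known.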