On Tue, Sep 9, 2014 at 9:11 AM, Daniel Schneller
<daniel.schneller at centerdevice.com> wrote:
> Hi list!
>
> Under http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-September/033670.html
> I found a situation not unlike ours, but unfortunately either
> the list archive fails me or the discussion ended without a
> conclusion, so I dare to ask again :)
>
> We currently have a setup of 4 servers with 12 OSDs each,
> combined journal and data. No SSDs.
>
> We develop a document management application that accepts user
> uploads of all kinds of documents and processes them in several
> ways. For any given document, we might create anywhere from tens
> to several hundred dependent artifacts.
>
> We are now preparing to move from Gluster to a Ceph-based
> backend. The application uses the Apache JClouds library to
> talk to the Rados Gateways that are running on all 4 of these
> machines, load balanced by haproxy.
>
> We currently intend to create one container for each document
> and put all the dependent and derived artifacts as objects into
> that container. This gives us a nice compartmentalization per
> document, and it also makes it easy to remove a document and
> everything that is connected with it.
>
> During the first test runs we ran into the default limit of
> 1000 containers per user. In the thread mentioned above that
> limit was removed (by setting the max_buckets value to 0). We did
> that and can now upload more than 1000 documents.
>
> I would just like to understand:
>
> a) Is this design recommended, or are there reasons to go
>    about the whole issue in a different way, potentially giving
>    up the benefit of having all document artifacts under one
>    convenient handle?
>
> b) Is there any absolute limit for max_buckets that we will run
>    into? Remember, we are talking about tens of millions of
>    containers over time.
>
> c) Are any performance issues to be expected with this design,
>    and can we tune any parameters to alleviate them?
>
> Any feedback would be very much appreciated.

Yehuda can talk about this with more expertise than I can, but I
think it should be basically fine. By creating so many buckets you
are decreasing the effectiveness of RGW's metadata caching, which
means the initial lookup in a particular bucket might take longer.

The big concern is that we do maintain a per-user list of all their
buckets (stored in a single RADOS object), so if you have an extreme
number of buckets that RADOS object could get pretty big and become
a bottleneck when creating, removing, or listing buckets. You should
run your own experiments to figure out what the limits are there;
perhaps you have an easy way of sharding documents across different
users.

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
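
[For the archive, a rough sketch of the user-sharding idea Greg
mentions, in the JClouds/Java setup the original post describes.
This is only an illustration, not a tested implementation: the
endpoint, user credentials, shard count, document id, and the
shardFor() helper are all made up, and it assumes the generic
jclouds "s3" provider pointed at the haproxy front end.]

    import org.jclouds.ContextBuilder;
    import org.jclouds.blobstore.BlobStore;
    import org.jclouds.blobstore.BlobStoreContext;

    public class ShardedDocumentStore {

        // Credentials for several pre-provisioned RGW users
        // (hypothetical names; e.g. created with radosgw-admin).
        static final String[][] RGW_USERS = {
            { "docs-shard-0-access", "docs-shard-0-secret" },
            { "docs-shard-1-access", "docs-shard-1-secret" },
            { "docs-shard-2-access", "docs-shard-2-secret" },
        };

        // Assumed haproxy front end for the four gateways.
        static final String RGW_ENDPOINT = "http://rgw.example.com";

        // Pick an RGW user deterministically from the document id, so
        // every artifact of a document lands under the same user and
        // the per-user bucket-list object is split across users.
        static int shardFor(String documentId, int shards) {
            return Math.floorMod(documentId.hashCode(), shards);
        }

        // Open a jclouds BlobStore view against RGW for the user
        // that owns this document's shard.
        static BlobStoreContext contextFor(String documentId) {
            String[] creds = RGW_USERS[shardFor(documentId, RGW_USERS.length)];
            return ContextBuilder.newBuilder("s3")   // RGW speaks the S3 API
                    .endpoint(RGW_ENDPOINT)
                    .credentials(creds[0], creds[1])
                    .buildView(BlobStoreContext.class);
        }

        public static void main(String[] args) {
            String documentId = "doc-12345";          // hypothetical document handle
            try (BlobStoreContext ctx = contextFor(documentId)) {
                BlobStore store = ctx.getBlobStore();
                // Still one container per document, as in the original
                // design; the container just ends up on one of several
                // users instead of all on a single user.
                store.createContainerInLocation(null, documentId);
                store.putBlob(documentId,
                        store.blobBuilder("artifact-1")
                             .payload(new byte[0])    // derived artifact bytes
                             .build());
            }
        }
    }

The point of the sharding is only that no single per-user bucket-list
object has to hold tens of millions of entries; each user's list stays
at roughly (total buckets / number of users), which you can size from
your own experiments.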