Yes, 200 million objects is far too many for a single Ceph RGW bucket. We encountered this problem early on and split our data across 20 buckets, each of which has a sharded bucket index with 20 shards.
Unfortunately, enabling the sharded RGW index requires recreating the bucket and all objects.
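For anyone wanting to do the same kind of application-level split across buckets, here is a minimal sketch of how a client could pick the bucket for a given object key. The "data" prefix, the zero-padded naming scheme, and the choice of SHA-256 are assumptions for illustration only, not necessarily what we run; the only thing that matters is that the mapping is stable.

```python
import hashlib

NUM_BUCKETS = 20        # the 20 buckets mentioned above
BUCKET_PREFIX = "data"  # hypothetical naming scheme, not from this thread


def bucket_for_key(key: str, num_buckets: int = NUM_BUCKETS,
                   prefix: str = BUCKET_PREFIX) -> str:
    """Map an object key to one of num_buckets bucket names using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce modulo the bucket count.
    index = int.from_bytes(digest[:8], "big") % num_buckets
    return f"{prefix}-{index:02d}"
```

Every reader and writer has to use the same function and the same bucket count, since changing either one changes where existing keys are expected to live.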
The fact that Ceph stores the bucket indexes in Ceph itself makes RGW less reliable in our experience. Instead of depending on one object you're depending on two: the index and the object itself. If the cluster has any issue with the index, it blocks access to the object as well, which is very frustrating. If we could get and put objects in RGW without hitting the index at all, we would; we don't need to list our buckets.
-Ben
On Tue, Sep 20, 2016 at 1:57 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> On 20 September 2016 at 10:55, Василий Ангапов <angapov@xxxxxxxxx> wrote:
>
>
> Hello,
>
> Is there any way to copy rgw bucket index to another Ceph node to
> lower the downtime of RGW? For now I have a huge bucket with 200
> million files and its backfilling is blocking RGW completely for an
> hour and a half even with 10G network.
>
No, not really. What you really want is the bucket sharding feature.
So what you can do is enable the sharding, create a NEW bucket and copy over the objects.
Afterwards you can remove the old bucket.
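The copy step could be sketched roughly as follows through the S3 API. This assumes a boto3 S3 client pointed at the RGW endpoint and assumes index sharding for new buckets has already been enabled on the gateway (commonly done with the rgw_override_bucket_index_max_shards option; check the documentation for your release). The function name copy_bucket is made up for this example, and the sketch does not handle retries, failures, or very large objects.

```python
def copy_bucket(s3, src_bucket, dst_bucket):
    """Copy every object from src_bucket into dst_bucket; return the number copied.

    `s3` is expected to behave like a boto3 S3 client.
    """
    # Create the destination bucket; it is created with whatever index
    # sharding the gateway is configured to apply to new buckets.
    s3.create_bucket(Bucket=dst_bucket)

    copied = 0
    paginator = s3.get_paginator("list_objects")
    for page in paginator.paginate(Bucket=src_bucket):
        # "Contents" is absent from a page when there is nothing to list.
        for obj in page.get("Contents", []):
            s3.copy_object(
                Bucket=dst_bucket,
                Key=obj["Key"],
                CopySource={"Bucket": src_bucket, "Key": obj["Key"]},
            )
            copied += 1
    return copied
```

It would be called with a client created by boto3.client("s3", endpoint_url=...) using your own endpoint and credentials. Listing a bucket of this size is slow in itself, so in practice the copy would likely need to be parallelised and made resumable.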
Wido
> Thanks!
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com