Re: rgw bucket index manual copy

Hi Ben,

Since Jewel, RadosGW supports blind (indexless) buckets.
To enable the blind bucket configuration I used:

radosgw-admin zone get --rgw-zone=default > default-zone.json
#change index_type from 0 to 1
vi default-zone.json
radosgw-admin zone set --rgw-zone=default --infile default-zone.json
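
For reference, the field to change is index_type in the placement target of default-zone.json. In the Jewel zone format that section looks roughly like this (the pool names shown are just the defaults and may differ on your cluster; index_type 1 is the blind/indexless type):

"placement_pools": [
    {
        "key": "default-placement",
        "val": {
            "index_pool": "default.rgw.buckets.index",
            "data_pool": "default.rgw.buckets.data",
            "data_extra_pool": "default.rgw.buckets.non-ec",
            "index_type": 1
        }
    }
]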

To apply the changes you have to restart all the RGW daemons (an example restart command is below). After that, all newly created buckets will have no index (listing such a bucket returns empty output), but GET/PUT works perfectly.
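
On systemd-based deployments the restart is something like this on each gateway host (treat this as a sketch, since the exact unit name depends on how your RGW was deployed):

systemctl restart ceph-radosgw.target
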
In my tests there was no performance difference between SSD-backed indexes and the 'blind bucket' configuration.

Stas

> On Sep 21, 2016, at 2:26 PM, Ben Hines <bhines@xxxxxxxxx> wrote:
> 
> Nice, thanks! Must have missed that one. It might work well for our use case since we don't really need the index. 
> 
> -Ben
> 
> On Wed, Sep 21, 2016 at 11:23 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> On Wednesday, September 21, 2016, Ben Hines <bhines@xxxxxxxxx> wrote:
> Yes, 200 million is way too big for a single Ceph RGW bucket. We encountered this problem early on and sharded our data across 20 buckets, each of which has a sharded bucket index with 20 shards.
> 
> Unfortunately, enabling the sharded RGW index requires recreating the bucket and all objects.
> 
> The fact that Ceph uses Ceph itself for the bucket indexes makes RGW less reliable in our experience. Instead of depending on one object you're depending on two: the index and the object itself. If the cluster has any issues with the index, the fact that it blocks access to the object itself is very frustrating. If we could retrieve/put objects in RGW without hitting the index at all we would - we don't need to list our buckets.
> 
> I don't know the details or which release it went into, but indexless buckets are now a thing -- check the release notes or search the lists! :)
> -Greg
> 
>  
> 
> -Ben
> 
> On Tue, Sep 20, 2016 at 1:57 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> 
> > On September 20, 2016 at 10:55, Василий Ангапов <angapov@xxxxxxxxx> wrote:
> >
> >
> > Hello,
> >
> > Is there any way to copy the RGW bucket index to another Ceph node to
> > lower the downtime of RGW? Right now I have a huge bucket with 200
> > million files, and its backfilling blocks RGW completely for an hour
> > and a half, even with a 10G network.
> >
> 
> No, not really. What you really want is the bucket sharding feature.
> 
> So what you can do is enable sharding, create a NEW bucket, and copy over the objects.
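
For reference, the sharding Wido mentions here is controlled by an RGW config option that only applies to buckets created after it is set. A rough sketch for ceph.conf on the gateway hosts (the section name depends on your RGW instance name, and 20 shards simply mirrors the numbers in this thread):

[client.rgw.<your-instance-name>]
rgw_override_bucket_index_max_shards = 20

Because it only affects newly created buckets, the new bucket + copy approach above is still needed.
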
> 
> Afterwards you can remove the old bucket.
> 
> Wido
> 
> > Thanks!
> 
> 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



