Re: RGW: Reshard index of non-master zones in multi-site

On Wed, 3 Apr 2019 at 09:41, Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
>
> On Tue, 19 Feb 2019 at 10:11, Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
> >
> >
> > # ./radosgw-gc-bucket-indexes.sh master.rgw.buckets.index | wc -l
> > 7511
> >
> > # ./radosgw-gc-bucket-indexes.sh secondary1.rgw.buckets.index | wc -l
> > 3509
> >
> > # ./radosgw-gc-bucket-indexes.sh secondary2.rgw.buckets.index | wc -l
> > 3801
> >
>
> The documentation is a horrid mess on the subject of multi-site resharding:
>
> http://docs.ceph.com/docs/mimic/radosgw/dynamicresharding/#manual-bucket-resharding
>
> https://www.suse.com/documentation/suse-enterprise-storage-5/book_storage_admin/data/ogw_bucket_sharding.html
> (Manual Resharding)
>
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/object_gateway_guide_for_red_hat_enterprise_linux/index#manually-resharding-buckets-with-multisite-rgw
>
> All three disagree with each other about the correct process for
> resharding indexes in multi-site.  Worse, none of them seems to work
> correctly anyway.
>
> The 13.2.5 changelog looked promising, up until the sentence: "These
> commands should not be used on a multisite setup as the stale
> instances may be unlikely to be from a reshard and can have
> consequences".
>
> http://docs.ceph.com/docs/master/releases/mimic/#v13-2-5-mimic
>

The stale-instances feature correctly identifies only one stale bucket instance:

# radosgw-admin reshard stale-instances list
[
    "mybucket:0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1"
]

I can confirm that this is one of the orphaned index instances:

# rados -p .rgw.buckets.index ls | grep 0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.0
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.3
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.9
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.5
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.2
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.7
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.1
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.10
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.4
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.6
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.11
.dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.8
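
For what it's worth, the live instance ID can be cross-checked against
the bucket entrypoint metadata ('mybucket' below stands in for the
actual bucket name):

# radosgw-admin metadata get bucket:mybucket | grep bucket_id

If the bucket_id reported there is not
0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1, then these shard
objects belong to a dead instance.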

I would assume, then, that contrary to what the documentation says, it
is safe to run 'reshard stale-instances rm' on a multi-site setup.
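
If that assumption holds, the cleanup plus a quick sanity check would
be something like:

# radosgw-admin reshard stale-instances rm
# rados -p .rgw.buckets.index ls | grep 97248676.1

The second command should come back empty once the shard objects of
the stale instance have been removed.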

However, it is quite telling that the author of this feature doesn't
trust what they have written to work correctly.

There are still thousands of stale index objects that 'stale-instances
list' didn't pick up, though.  It appears that radosgw-admin only
consults the 'metadata list bucket' data, not what is physically inside
the pool.
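
For reference, the comparison I'm doing by hand, and roughly what
radosgw-gc-bucket-indexes.sh boils down to, is something like this
(a sketch only; the temp file names are arbitrary):

#!/bin/sh
pool="$1"

# Bucket instance IDs RGW still knows about.  Entries in the JSON
# array look like "bucket:zoneid.num.num", one per line.
radosgw-admin metadata list bucket.instance \
    | tr -d ' "[],' | sed -n 's/^.*://p' > /tmp/live-instances

# Index objects physically present in the pool.
rados -p "$pool" ls | grep '^\.dir\.' > /tmp/index-objects

# Index objects whose name contains no live instance ID are the
# stale/orphaned candidates.
grep -vFf /tmp/live-instances /tmp/index-objects

Anything this prints still needs eyeballing before removal, of course.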

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


