Hi Christian,

Dynamic bucket-index sharding for multi-site setups is being worked
on, and will land in the N release cycle.

regards,

Matt

On Sun, Apr 7, 2019 at 6:59 PM Christian Balzer <chibi@xxxxxxx> wrote:
>
> On Fri, 5 Apr 2019 11:42:28 -0400 Casey Bodley wrote:
>
> > Hi Iain,
> >
> > Resharding is not supported in multisite. The issue is that the master zone
> > needs to be authoritative for all metadata. If bucket reshard commands run
> > on the secondary zone, they create new bucket instance metadata that the
> > master zone never sees, so replication can't reconcile those changes.
> >
>
> Unless the above should read "dynamic resharding..." this is in clear
> contrast to the documentation by Red Hat Iain cited.
>
> But given how costly manual resharding is, including service interruption,
> that's not really an option for most people either.
>
> Looks like Ceph is out of the race for multi-PB use case here, unless
> multi-site and dynamic resharding are less than 6 months away.
>
> Regards,
>
> Christian
>
> > The 'stale-instances rm' command is not safe to run in multisite because it
> > can misidentify as 'stale' some bucket instances that were deleted on the
> > master zone, where data sync on the secondary zone hasn't yet finished
> > deleting all of the objects it contained. Deleting these bucket instances
> > and their associated bucket index objects would leave any remaining objects
> > behind as orphans and leak storage capacity.
> >
> > On Thu, Apr 4, 2019 at 3:28 PM Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
> >
> > > On Wed, 3 Apr 2019 at 09:41, Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue, 19 Feb 2019 at 10:11, Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
> > > > >
> > > > >
> > > > > # ./radosgw-gc-bucket-indexes.sh master.rgw.buckets.index | wc -l
> > > > > 7511
> > > > >
> > > > > # ./radosgw-gc-bucket-indexes.sh secondary1.rgw.buckets.index | wc -l
> > > > > 3509
> > > > >
> > > > > # ./radosgw-gc-bucket-indexes.sh secondary2.rgw.buckets.index | wc -l
> > > > > 3801
> > > > >
> > > >
> > > > Documentation is a horrid mess around the subject on multi-site resharding
> > > >
> > > > http://docs.ceph.com/docs/mimic/radosgw/dynamicresharding/#manual-bucket-resharding
> > > >
> > > > https://www.suse.com/documentation/suse-enterprise-storage-5/book_storage_admin/data/ogw_bucket_sharding.html
> > > > (Manual Resharding)
> > > >
> > > > https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/object_gateway_guide_for_red_hat_enterprise_linux/index#manually-resharding-buckets-with-multisite-rgw
> > > >
> > > > All disagree with each other over the correct process to reshard
> > > > indexes in multi-site. Worse, none of them seem to work correctly
> > > > anyway.
> > > >
> > > > Changelog of 13.2.5 looked promising up until the sentence: "These
> > > > commands should not be used on a multisite setup as the stale
> > > > instances may be unlikely to be from a reshard and can have
> > > > consequences".
> > > >
> > > > http://docs.ceph.com/docs/master/releases/mimic/#v13-2-5-mimic
> > > >
> > >
> > > The stale-instances feature only correctly identifies one stale shard.
> > >
> > > # radosgw-admin reshard stale-instances list
> > > [
> > >     "mybucket:0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1"
> > > ]
> > >
> > > I can confirm this is one of the orphaned index objects.
> > >
> > > # rados -p .rgw.buckets.index ls | grep 0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.0
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.3
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.9
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.5
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.2
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.7
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.1
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.10
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.4
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.6
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.11
> > > .dir.0ef1a91a-4aee-427e-bdf8-30589abb2d3e.97248676.1.8
> > >
> > > I would assume then that unlike what documentation says, it's safe to
> > > run 'reshard stale-instances rm' on a multi-site setup.
> > >
> > > However it is quite telling if the author of this feature doesn't
> > > trust what they have written to work correctly.
> > >
> > > There are still thousands of stale index objects that 'stale-instances
> > > list' didn't pick up though. But it appears that radosgw-admin only
> > > looks at 'metadata list bucket' data, and not what is physically
> > > inside the pool.
> > >
> > > --
> > > Iain Buclaw
> > >
> > > *(p < e ? p++ : p) = (c & 0x0f) + '0';
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
>
>
> --
> Christian Balzer        Network/Systems Engineer
> chibi@xxxxxxx           Rakuten Communications
>

--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
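
The gap Iain describes, where 'stale-instances list' consults bucket metadata rather than the contents of the index pool, can be checked offline by comparing two listings. The following is a minimal Python sketch and not part of radosgw-admin: the function names are hypothetical, and it assumes that index objects are named '.dir.<instance id>' with an optional '.<shard>' suffix (as in the rados ls output quoted above) and that the metadata listing is a JSON array of 'bucket:instance_id' strings (as in the stale-instances list output quoted above). It only reports names. Per Casey's explanation, removing index objects in a multisite setup can orphan data that is still syncing.

```python
import json


def instance_ids(metadata_json):
    """Return the set of instance IDs from a JSON array of 'bucket:instance_id' strings.

    Assumption: the instance ID is everything after the last ':' in each entry.
    """
    return {entry.rsplit(":", 1)[-1] for entry in json.loads(metadata_json)}


def unreferenced_index_objects(pool_listing, known_ids):
    """Return index object names in the pool whose instance ID is not in known_ids.

    pool_listing is the whitespace-separated text of an index pool listing.
    Assumption: index objects are named '.dir.<instance id>' (unsharded) or
    '.dir.<instance id>.<shard number>' (sharded).
    """
    candidates = []
    for name in pool_listing.split():
        if not name.startswith(".dir."):
            continue  # not a bucket index object
        marker = name[len(".dir."):]
        # Split off a possible trailing shard number.
        base, _, shard = marker.rpartition(".")
        if marker in known_ids:
            continue  # unsharded index of a known instance
        if shard.isdigit() and base in known_ids:
            continue  # shard of a known instance
        candidates.append(name)
    return sorted(candidates)


if __name__ == "__main__":
    # Illustrative data only, not taken from a real cluster.
    metadata = '["mybucket:aaaa.100.1"]'
    listing = ".dir.aaaa.100.1.0\n.dir.aaaa.100.1.1\n.dir.bbbb.200.1.0\n"
    for obj in unreferenced_index_objects(listing, instance_ids(metadata)):
        print(obj)
```

The two inputs could be captured with 'rados -p <index pool> ls' and 'radosgw-admin metadata list bucket.instance'. Treat the result as a list of candidates to investigate, not as a list that is safe to delete.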