Understanding reshard issues

After our Jewel to Luminous 12.2.2 upgrade, I ran into some of the same issues reported earlier on the list under "rgw resharding operation seemingly won't end". Some buckets were automatically added to the reshard list, and something happened overnight such that they could no longer be written to. A couple of our radosgw nodes hung due to inadequate limits on file handles, which may have been a contributing cause.
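
(For anyone who hits the same hang: this is roughly how we checked and then raised the open-file limit on the radosgw hosts. The unit name and limit value are just examples from our setup.)

grep "open files" /proc/$(pidof -s radosgw)/limits
# then "systemctl edit ceph-radosgw@rgw.$(hostname -s)" and add:
#   [Service]
#   LimitNOFILE=65536
# followed by a restart of the radosgw service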

I was able to correct the buckets using the "radosgw-admin bucket check --fix" command, and later disabled automatic resharding.
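
For reference, this is roughly what I ran (bucket name substituted), and the config snippet shows how I understand dynamic resharding is switched off on the rgw hosts - I believe the option is rgw_dynamic_resharding, but treat that as my reading of the docs rather than anything authoritative:

radosgw-admin bucket check --fix --bucket=<bucket-name>

# in ceph.conf on each radosgw host, then restart radosgw:
[client.rgw.<name>]
rgw dynamic resharding = false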

As an experiment, I selected an unsharded bucket to attempt a manual reshard. I added it to the reshard list, then ran "radosgw-admin reshard execute". The bucket in question contains 184,000 objects and was being converted from 1 to 3 shards.
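
The sequence was approximately this (bucket name substituted):

radosgw-admin reshard add --bucket=<bucket-name> --num-shards=3
radosgw-admin reshard list
radosgw-admin reshard execute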

I'm trying to understand what I found...

1) The "radosgw-admin reshard execute" command never returned. I had expected it to kick off a background operation, but perhaps that expectation was mistaken.

2) After 2 days it was still running. Is there any way to check progress, such as by querying something about the "new_bucket_instance_id" reported by "reshard status"? (The commands I've been trying are sketched below, after point 3.)

3) When I tested uploading an object to the bucket, I got an error - the client reported response code "UnknownError" - while radosgw logged:

2017-12-13 10:56:44.486131 7f02b2985700  0 block_while_resharding ERROR: bucket is still resharding, please retry
2017-12-13 10:56:44.488657 7f02b2985700  0 NOTICE: resharding operation on bucket index detected, blocking
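
Going back to (2), this is roughly what I've been looking at to try to gauge progress. The index pool name below is just the default on our cluster, and I'm only guessing that the shard objects for the new instance would be visible there:

radosgw-admin reshard status --bucket=<bucket-name>
# look for index shard objects belonging to the new_bucket_instance_id:
rados -p default.rgw.buckets.index ls | grep <new_bucket_instance_id>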

As for (3): the introduction to dynamic resharding says that "there is no need to stop IO operations that go to the bucket (although some concurrent operations may experience additional latency when resharding is in progress)" - so I feel sure something must be wrong here.

I'd like to get a feel for how long it might take to reshard a smallish bucket of this sort, and whether it can be done without making it unwriteable, before considering how to handle our older and more pathological buckets (multi-million objects in a single shard).

Thanks for any pointers,

Graham
--
Graham Allan
Minnesota Supercomputing Institute - gta@xxxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


