Hi!
After my radosgw setup broke completely because of damaged buckets, I
deleted all rgw pools and started from scratch.
The problem is reproducible, though. After pushing about 100,000 objects
into a bucket, the resharding process appears to start, and the bucket
becomes unresponsive.
I just see lots of these messages in all rgw logs:
2018-01-15 16:57:45.108826 7fd1779b1700 0 block_while_resharding ERROR: bucket is still resharding, please retry
2018-01-15 16:57:45.119184 7fd1779b1700 0 NOTICE: resharding operation on bucket index detected, blocking
2018-01-15 16:57:45.260751 7fd1120e6700 0 block_while_resharding ERROR: bucket is still resharding, please retry
2018-01-15 16:57:45.280410 7fd1120e6700 0 NOTICE: resharding operation on bucket index detected, blocking
2018-01-15 16:57:45.300775 7fd15b979700 0 block_while_resharding ERROR: bucket is still resharding, please retry
2018-01-15 16:57:45.300971 7fd15b979700 0 WARNING: set_req_state_err err_no=2300 resorting to 500
2018-01-15 16:57:45.301042 7fd15b979700 0 ERROR: RESTFUL_IO(s)->complete_header() returned err=Input/output error
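
To get a rough idea of how many requests are affected, the messages can be
counted per thread. The following is only a minimal sketch: it assumes each
log entry sits on a single line in the layout shown above (timestamp, thread
id, level, message), and the function name summarize is made up for
illustration.

```python
import re
from collections import Counter

# Assumed layout: "<date> <time> <thread-id> <level> <message>"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+([0-9a-f]+)\s+(-?\d+)\s+(.*)$"
)

def summarize(lines):
    """Count resharding-related messages per (thread id, kind)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not match the assumed layout
        _timestamp, thread, _level, msg = m.groups()
        if "block_while_resharding" in msg:
            counts[(thread, "still_resharding")] += 1
        elif "resharding operation" in msg:
            counts[(thread, "blocking")] += 1
    return counts

# Three entries taken from the log excerpt above, one entry per line.
SAMPLE = [
    "2018-01-15 16:57:45.108826 7fd1779b1700 0 block_while_resharding ERROR: bucket is still resharding, please retry",
    "2018-01-15 16:57:45.119184 7fd1779b1700 0 NOTICE: resharding operation on bucket index detected, blocking",
    "2018-01-15 16:57:45.300971 7fd15b979700 0 WARNING: set_req_state_err err_no=2300 resorting to 500",
]

if __name__ == "__main__":
    for (thread, kind), n in sorted(summarize(SAMPLE).items()):
        print(thread, kind, n)
```

For a real log file, the lines can be passed in directly, for example
summarize(open(path)) with path pointing at the rgw log.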
One radosgw process and two OSDs housing the bucket index/metadata are
still busy, but the resharding seems to be stuck again.
How long is this resharding process supposed to take? I cannot believe
that an application is supposed to block for more than half an hour...
I am inclined to open a bug report, but I am not yet sure where the
problem lies.
Some information:
* 3 RGW processes, 3 OSD hosts with 12 HDD OSDs and 6 SSD OSDs
* Ceph 12.2.2
* Auto-Resharding on, Bucket Versioning & Lifecycle rule enabled.
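
In case it helps anyone reproduce this, here is a small sketch of how the
resharding state could be collected with radosgw-admin. It is only an
assumption about a useful set of checks (reshard list, reshard status and
bucket stats), not a verified diagnosis procedure. The bucket name
"mybucket" and the helper names are placeholders.

```python
import shutil
import subprocess

def reshard_commands(bucket):
    """Build radosgw-admin invocations that inspect resharding for one bucket."""
    return [
        ["radosgw-admin", "reshard", "list"],
        ["radosgw-admin", "reshard", "status", "--bucket=" + bucket],
        ["radosgw-admin", "bucket", "stats", "--bucket=" + bucket],
    ]

def run_all(bucket):
    """Run each command and return {command string: (exit code, stdout)}."""
    if shutil.which("radosgw-admin") is None:
        # Not on a Ceph node: nothing to run, return an empty result.
        return {}
    results = {}
    for cmd in reshard_commands(bucket):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        results[" ".join(cmd)] = (proc.returncode, proc.stdout)
    return results

if __name__ == "__main__":
    # "mybucket" is a placeholder, replace it with the affected bucket.
    for command, (code, output) in run_all("mybucket").items():
        print("$", command, "(exit", str(code) + ")")
        print(output)
```

Running this at intervals while the bucket is blocked should show whether the
reshard queue entry or the shard count changes at all.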
Thanks,
Martin