+ceph-devel
On Wed, Feb 2, 2022 at 10:56 PM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
On Wed, Feb 2, 2022 at 8:36 AM Yuval Lifshitz <ylifshit@xxxxxxxxxx> wrote:
>
> i do see sync errors with "ERR_BUSY_RESHARDING": https://0x0.st/oH3D.json
> after dynamic reshard happened mid-sync, even though sync was finished successfully.
>
> is this expected?
those errors are possible, but i wouldn't say expected.
but shouldn't the errors get cleared after the objects were successfully synced?
if fetch_remote_obj() is returning this error, that seems to imply that
RGWRados::guard_reshard() retried the index operation
NUM_RESHARD_RETRIES=10 times and still found it locked for resharding.
and after each try, guard_reshard() calls
RGWRados::block_while_resharding(), which has its own retry loop with
num_retries=10 that polls the reshard status then sleeps 5 seconds
with reshard_wait->wait()
if my understanding is correct, that would mean that the successful
reshard took more than ~500 seconds (10 x 10 x 5) to complete? or
something under guard_reshard() isn't working right
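
The worst-case wait implied by the nested retries above can be sketched as follows. This is only a simplified model built from the numbers quoted in this thread (10 outer retries, 10 inner polls, 5 seconds per poll), not the actual RGW code; the function name simulate_guard_reshard and its return shape are hypothetical.

```python
# Simplified model of the nested retry loops described in this thread.
# Not the real Ceph implementation: the constants are the values quoted
# above, and simulate_guard_reshard() is a hypothetical helper.

NUM_RESHARD_RETRIES = 10   # outer retries in guard_reshard()
BLOCK_NUM_RETRIES = 10     # inner polls in block_while_resharding()
RESHARD_WAIT_SECONDS = 5   # sleep per inner poll (reshard_wait->wait())


def simulate_guard_reshard(reshard_duration):
    """Return (seconds_waited, succeeded) for a reshard that keeps the
    bucket index locked for reshard_duration seconds."""
    elapsed = 0
    for _ in range(NUM_RESHARD_RETRIES):
        # Outer loop: attempt the index operation.
        if elapsed >= reshard_duration:
            return elapsed, True  # reshard is done, operation goes through
        # Inner loop: poll the reshard status, sleeping between polls.
        for _ in range(BLOCK_NUM_RETRIES):
            if elapsed >= reshard_duration:
                break
            elapsed += RESHARD_WAIT_SECONDS
    # Every outer retry found the index still resharding: in this model
    # that is the point where ERR_BUSY_RESHARDING would surface.
    return elapsed, False


# Upper bound on the time spent waiting before the error is returned.
worst_case_seconds = (
    NUM_RESHARD_RETRIES * BLOCK_NUM_RETRIES * RESHARD_WAIT_SECONDS
)
```

Under this model, a reshard that takes about 10 seconds should be absorbed by the first inner loop, so the error would only be expected when the index stays locked for roughly 500 seconds or when something in the retry path misbehaves.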
it looks like there is a problem. when I look at the client that uploads the objects to the primary, it stalls for about 10 seconds while the reshard is happening. however, the secondary sync process stalls for much longer, until it eventually syncs successfully.