Re: radosgw-agent keeps syncing most active bucket - ignoring others

Hmm,

looks like intended behaviour:

<SNIP>
CommitDate: Mon Mar 3 06:08:42 2014 -0800

   worker: process all bucket instance log entries at once

 Currently if there are more than max_entries in a single bucket
   instance's log, only max_entries of those will be processed, and the
   bucket instance will not be examined again until it is modified again.

   To keep it simple, get the entire log of entries to be updated and
   process them all at once. This means one busy shard may block others
   from syncing, but multiple instances of radosgw-agent can be run to
   circumvent that issue. With only one instance, users can be sure
   everything is synced when an incremental sync completes with no
   errors.
</SNIP>
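For reference, the looping behaviour that commit describes boils down to something like this. This is only a simplified Python sketch of the paging pattern; the function names (get_log_entries, fake_get_log_entries) and the entry layout are made up for illustration and are not the agent's actual API:

```python
# Hypothetical sketch of the "process all bucket instance log entries at
# once" behaviour: keep paging through the log until a short page signals
# the end. While one very busy bucket log is being drained, the worker
# cannot move on to other buckets -- which is the blocking effect
# discussed in this thread.

def fetch_all_entries(get_log_entries, marker, max_entries=1000):
    """Page through a bucket instance log until it is fully drained."""
    entries = []
    while True:
        page = get_log_entries(marker, max_entries)
        entries.extend(page)
        if len(page) < max_entries:
            # A short page means we reached the end of the log.
            return entries
        # Advance the marker past the last entry seen and loop again.
        marker = page[-1]["id"]

# Tiny simulation: a 2500-entry log paged 1000 entries at a time.
LOG = [{"id": i} for i in range(2500)]

def fake_get_log_entries(marker, max_entries):
    start = 0 if marker is None else marker + 1
    return LOG[start:start + max_entries]

all_entries = fetch_all_entries(fake_get_log_entries, None, 1000)
```

In this simulation the loop makes three fetches (1000 + 1000 + 500 entries) before returning, which is why a bucket that gains new entries faster than they can be drained keeps the worker occupied indefinitely.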

However, this brings us to a new issue. After starting a second agent,
one agent stayed busy syncing the busy shard while the other correctly
synced all of the other buckets. So far, so good. But since a few of
those buckets are almost static, it looks like the agent started syncing
them again from the beginning in a second run.
Versioning was enabled on those buckets after they were created, with
objects (and removed objects) already in them, so the agent appears to
be copying those unversioned objects as versioned ones, creating a lot
of delete markers and multiple versions in the secondary zone.

Does anyone have an idea how to handle this correctly? I already did a
cleanup some weeks ago, but if the agent keeps restarting the sync from
the beginning, I'll have to clean up every time.

regards,
Sam

On 18-08-15 09:36, Sam Wouters wrote:
> Hi,
>
> from the docs of radosgw-agent and some items on this list, I understood
> that the max-entries argument was there to prevent a very active bucket
> from keeping the other buckets from being synced. In our agent logs,
> however, we saw a lot of "bucket instance bla has 1000 entries after
> bla", and the agent kept on syncing that active bucket.
>
> Looking at the code, in class DataWorkerIncremental, it looks like the
> agent loops fetching log entries from the bucket until it receives
> fewer entries than max_entries. Is this intended behaviour? I would
> expect it to just pass max_entries log entries for processing and
> advance the marker.
>
> Is there any other way to make sure less active buckets are synced
> frequently? We've tried increasing num-workers, but this only affects
> the first pass.
>
> Thanks,
> Sam
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
