Hmm, looks like intended behaviour:

<SNIP>
CommitDate: Mon Mar 3 06:08:42 2014 -0800

    worker: process all bucket instance log entries at once

    Currently if there are more than max_entries in a single bucket
    instance's log, only max_entries of those will be processed, and the
    bucket instance will not be examined again until it is modified again.

    To keep it simple, get the entire log of entries to be updated and
    process them all at once. This means one busy shard may block others
    from syncing, but multiple instances of radosgw-agent can be run to
    circumvent that issue. With only one instance, users can be sure
    everything is synced when an incremental sync completes with no errors.
</SNIP>

However, this brings us to a new issue. After starting a second agent, one
of the agents is busy syncing the busy shard while the other agent correctly
synced all of the other buckets. So far, so good. But since a few of those
buckets are almost static, it looks like the agent started syncing them from
the beginning all over again on a second pass.

Versioning was enabled on those buckets after they were created, when they
already contained objects and removed objects, so it seems the agent is
copying those unversioned objects as versioned ones, creating a lot of
delete markers and multiple versions in the secondary zone.

Does anyone have an idea how to handle this correctly? I already did a
cleanup some weeks ago, but if the agent keeps restarting the sync from the
beginning, I'll have to clean up every time.

regards,
Sam

On 18-08-15 09:36, Sam Wouters wrote:
> Hi,
>
> From the docs of radosgw-agent and some items on this list, I understood
> that the max-entries argument was there to prevent a very active bucket
> from keeping the other buckets from staying synced. In our agent logs,
> however, we saw a lot of "bucket instance bla has 1000 entries after bla"
> messages, and the agent kept on syncing that active bucket.
>
> Looking at the code, in class DataWorkerIncremental, it looks like the
> agent loops fetching log entries from the bucket until it receives fewer
> entries than max_entries. Is this intended behaviour? I would expect it
> to just pass max_entries log entries for processing and increase the
> marker.
>
> Is there any other way to make sure less active buckets are synced
> frequently? We've tried increasing num-workers, but this only affects
> the first pass.
>
> Thanks,
> Sam

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
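[Editor's note: the pagination behaviour discussed in the thread can be
sketched roughly as below. This is a hypothetical illustration, not the
actual radosgw-agent code; `fetch_log`, `drain_log`, and the list-backed
log are stand-ins for the real shard-log API.]

```python
# Hypothetical sketch of the loop described for DataWorkerIncremental:
# keep fetching pages of up to max_entries until a short page comes back,
# and only then hand the accumulated entries off for processing.

def fetch_log(log, marker, max_entries):
    """Stand-in for the call that returns up to max_entries log
    entries starting at position `marker`."""
    return log[marker:marker + max_entries]

def drain_log(log, max_entries=1000):
    """Fetch the entire log of a bucket instance before processing.

    A page of exactly max_entries means there may be more, so the loop
    continues; a shorter page means the end of the log was reached.
    """
    marker = 0
    entries = []
    while True:
        page = fetch_log(log, marker, max_entries)
        entries.extend(page)
        marker += len(page)
        if len(page) < max_entries:
            return entries

# A busy bucket with 2500 pending entries takes three fetches
# (1000, 1000, 500) before any entry is processed, which is why a
# single busy shard can occupy one worker for a long time.
busy_bucket_log = list(range(2500))
print(len(drain_log(busy_bucket_log)))  # 2500
```

This is why max-entries alone does not round-robin between buckets: it
only sizes the pages within one bucket's drain, not the per-bucket budget.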