On Wed, Feb 5, 2014 at 2:21 PM, Josh Durgin <josh.durgin@xxxxxxxxxxx> wrote:
> On 02/05/2014 01:23 PM, Craig Lewis wrote:
>>
>> On 2/4/14 20:02, Josh Durgin wrote:
>>>
>>> From the log it looks like you're hitting the default maximum number of
>>> entries to be processed at once per shard. This was intended to prevent
>>> one really busy shard from blocking progress on syncing other shards,
>>> since the remainder will be synced the next time the shard is processed.
>>> Perhaps the default is too low though, or the idea should be scrapped
>>> altogether since you can sync other shards in parallel.
>>>
>>> For your particular usage, since you're updating the same few buckets,
>>> the max entries limit is hit constantly. You can increase it with
>>> max-entries: 1000000 in the config file or --max-entries 10000000 on
>>> the command line.
>>>
>>> Josh
>>
>> This doesn't appear to have any effect:
>> root@ceph1c:/etc/init.d# grep max-entries /etc/ceph/radosgw-agent.conf
>> max-entries: 1000000
>> root@ceph1c:/etc/init.d# egrep 'has [0-9]+ entries after' /var/log/ceph/radosgw-agent.log | tail -1
>> 2014-02-05T13:11:03.915 2743:INFO:radosgw_agent.worker:bucket instance
>> "live-2:us-west-1.35026898.2" has 1000 entries after "00000206789.410535.3"
>>
>> Neither does --max-entries 10000000:
>> root@ceph1c:/etc/init.d# ps auxww | grep radosgw-agent | grep max-entries
>> root 19710 6.0 0.0 74492 18708 pts/3 S 13:22 0:00 /usr/bin/python /usr/bin/radosgw-agent --incremental-sync-delay=10 --max-entries 10000000 -c /etc/ceph/radosgw-agent.conf
>> root@ceph1c:/etc/init.d# egrep 'has [0-9]+ entries after' /var/log/ceph/radosgw-agent.us-west-1.us-central-1.log | tail -1
>> 2014-02-05T13:22:58.577 21626:INFO:radosgw_agent.worker:bucket instance
>> "live-2:us-west-1.35026898.2" has 1000 entries after "00000207788.411542.2"
>>
>> I guess I'll look into that too, since I'll be in that area of the code.
>
> It seems to be a hardcoded limit on the server side to prevent a single
> osd operation from taking too long (see cls_log_list() in ceph.git
> src/cls/cls_log.cc).

Right, that part is intentional; otherwise a single osd operation might
take too long.

> This should probably be fixed in radosgw, but you could work around it
> with a loop in radosgw-agent.

No need to change the radosgw side; the limit is intentional. The agent
should assume requests are paged and behave accordingly.

Yehuda
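
For illustration, a paging loop on the agent side could look roughly like
the sketch below (Python, since radosgw-agent is Python). The client call
and its signature are assumptions made up for the sketch, not the actual
radosgw-agent or radosgw admin API; the point is just to keep requesting
pages until the server reports no more entries, rather than expecting one
call to return everything.

    # Rough sketch, not actual radosgw-agent code: page through one shard's
    # log by passing the last returned marker back in, since the OSD caps
    # how many entries cls_log_list() returns per call.

    SERVER_PAGE_CAP = 1000   # assumed per-call cap enforced on the server side

    def read_shard_log(client, shard_num, start_marker, max_entries):
        """Collect up to max_entries log entries for one shard, page by page.

        client.get_log(shard, marker, count) is a hypothetical stand-in for
        whatever call the agent uses to list a metadata/data log shard; it is
        assumed to return (entries, next_marker, truncated).
        """
        entries = []
        marker = start_marker
        truncated = True
        while truncated and len(entries) < max_entries:
            # Never ask for more than one server page, or more than we still want.
            want = min(SERVER_PAGE_CAP, max_entries - len(entries))
            page, marker, truncated = client.get_log(shard_num, marker, want)
            entries.extend(page)
        return entries, marker

With a loop like this the agent's own max-entries setting only bounds how
much it processes per sync pass, and the server's per-call cap no longer
matters to correctness.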