Re: RGW Replication

Craig Lewis
Senior Systems Engineer
Office +1.714.602.1309
Email clewis@xxxxxxxxxxxxxxxxxx


On 2/5/14 14:37, Yehuda Sadeh wrote:
> On Wed, Feb 5, 2014 at 2:21 PM, Josh Durgin <josh.durgin@xxxxxxxxxxx> wrote:
>> On 02/05/2014 01:23 PM, Craig Lewis wrote:
>>
>>> On 2/4/14 20:02, Josh Durgin wrote:
>>>
>>>> From the log it looks like you're hitting the default maximum number of
>>>> entries to be processed at once per shard. This was intended to prevent
>>>> one really busy shard from blocking progress on syncing other shards,
>>>> since the remainder will be synced the next time the shard is processed.
>>>> Perhaps the default is too low though, or the idea should be scrapped
>>>> altogether since you can sync other shards in parallel.
>>>>
>>>> For your particular usage, since you're updating the same few buckets,
>>>> the max entries limit is hit constantly. You can increase it with
>>>> max-entries: 1000000 in the config file or --max-entries 10000000 on
>>>> the command line.
>>>>
>>>> Josh
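
(As an aside: a minimal sketch of the per-shard budget Josh describes above, with hypothetical names rather than the real radosgw-agent internals. Each pass over a shard processes at most max_entries log entries, so one busy shard cannot starve the others; the remainder is picked up the next time the shard comes around.)

    # Hypothetical sketch, not the actual radosgw-agent code: each pass over
    # a shard is capped at max_entries, so a busy shard can't block the rest;
    # leftover entries are handled the next time the shard is processed.
    def sync_pass(shards, fetch_entries, apply_entry, max_entries=1000):
        for shard_id in shards:
            for entry in fetch_entries(shard_id, limit=max_entries):
                apply_entry(entry)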

>>> This doesn't appear to have any effect:
>>> root@ceph1c:/etc/init.d# grep max-entries /etc/ceph/radosgw-agent.conf
>>> max-entries: 1000000
>>> root@ceph1c:/etc/init.d# egrep 'has [0-9]+ entries after' /var/log/ceph/radosgw-agent.log  | tail -1
>>> 2014-02-05T13:11:03.915 2743:INFO:radosgw_agent.worker:bucket instance "live-2:us-west-1.35026898.2" has 1000 entries after "00000206789.410535.3"
>>>
>>> Neither does --max-entries 10000000:
>>> root@ceph1c:/etc/init.d# ps auxww | grep radosgw-agent | grep max-entries
>>> root     19710  6.0  0.0  74492 18708 pts/3    S    13:22 0:00 /usr/bin/python /usr/bin/radosgw-agent --incremental-sync-delay=10 --max-entries 10000000 -c /etc/ceph/radosgw-agent.conf
>>> root@ceph1c:/etc/init.d# egrep 'has [0-9]+ entries after' /var/log/ceph/radosgw-agent.us-west-1.us-central-1.log  | tail -1
>>> 2014-02-05T13:22:58.577 21626:INFO:radosgw_agent.worker:bucket instance "live-2:us-west-1.35026898.2" has 1000 entries after "00000207788.411542.2"
>>>
>>> I guess I'll look into that too, since I'll be in that area of the code.

>> It seems to be a hardcoded limit on the server side to prevent a single
>> osd operation from taking too long (see cls_log_list() in ceph.git
>> src/cls/cls_log.cc).
>
> Right, that part is intentional. Otherwise an osd operation might take too long.
>
>> This should probably be fixed in radosgw, but you could work around it
>> with a loop in radosgw-agent.
>
> No need to change the radosgw side, it's intentional. The agent should
> assume requests are paged and behave accordingly.
>
> Yehuda

For the record, I can't lower the value either:
root@ceph1c:/etc/init.d# ps auxww | grep radosgw-agent | grep max-entries
root     16151  1.6  0.0 222384 18980 pts/3    Sl   15:22   0:01 /usr/bin/python /usr/bin/radosgw-agent --incremental-sync-delay=10 --max-entries 888 -c /etc/ceph/radosgw.replicate.us-west-1-to-us-central-1.conf
root     17417  0.9  0.0 225056 19928 pts/3    Sl   15:22   0:00 /usr/bin/python /usr/bin/radosgw-agent --incremental-sync-delay=10 --max-entries 888 -c /etc/ceph/radosgw.replicate.us-west-1-to-us-central-1.conf
root@ceph1c:/etc/init.d# egrep 'has [0-9]+ entries after' /var/log/ceph/radosgw-agent.us-west-1.us-central-1.log  | tail -1
2014-02-05T15:23:00.802 17417:INFO:radosgw_agent.worker:bucket instance "live-2:us-west-1.35026898.2" has 1000 entries after "00000217778.421759.2"

I'll take a stab at it.
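
A minimal sketch of the marker-based paging Yehuda describes, assuming the gateway returns fixed-size pages (1000 entries in the logs above) plus a truncation flag; fetch_log_page and the entry fields are hypothetical, not the real radosgw-agent API:

    # Sketch only: keep requesting pages after the last marker seen until the
    # server reports the listing is no longer truncated.  The server caps each
    # page (1000 entries in the logs above) regardless of --max-entries.
    def sync_shard(fetch_log_page, apply_entry, page_size=1000):
        marker = None
        while True:
            page, truncated = fetch_log_page(marker=marker, max_entries=page_size)
            for entry in page:
                apply_entry(entry)
            if not page or not truncated:
                break
            marker = page[-1]['id']   # hypothetical field naming the last entry seen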
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
