Re: RGW/multisite sync traffic rps

I see the same issue (45k GET requests constantly as admin). My guess is that the primary site writes its changes to the datalog and the secondary sites pull these log entries as they change.
Do you have a user who is constantly uploading and deleting?

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

On 2021. Oct 22., at 10:46, Stefan Schueffler <s.schueffler@xxxxxxxxxxxxx> wrote:

Hi,

I have a question about RGW multisite. The sync traffic generates a lot of requests per second (around 1500), which seems high, especially compared to the actual volume of user/client requests.

We have a rather simple multisite setup with
- two Ceph clusters (16.2.6), 1 realm, 1 zonegroup, and one zone on each side, one of them being the master zone.
- latency between the two clusters of around 0.3 ms.
- 3 RGW/beast daemons running on each cluster.
- a handful of buckets (around 20), plus a check script which creates one bucket per second (and deletes it after validating that the bucket was created successfully).
- one bucket with a few million (smaller) objects; the others are (more or less) empty.
- just a few requests per second from the client side (mostly PUT objects into the one larger bucket), writing a few kilobytes per second.
- roughly 5 GB of disk space consumed in total at the moment, with the plan to grow this to a few TB over time.

Both clusters are in sync (after the initial full sync, they now do incremental sync). While new objects do sync from cluster A (the master, to which the clients connect) to B, we also see a lot of "internal" sync requests in our monitoring: each RGW daemon sends about 500 requests per second to an RGW daemon on cluster A, mostly to "/admin/log?…", which adds up to 1500 requests per second just for the sync. This results in almost 60% CPU usage for the rgw/beast processes.
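
To put numbers on the split between sync requests and client requests, something like the following could be run over the gateway access logs. This is only a rough sketch: the log line layout (timestamp, method, path, status separated by spaces) is an assumption made for illustration and will need adjusting to whatever your RGW or proxy actually writes, and the names tally_requests and sync_share are made up here.

```python
import re
from collections import Counter

# ASSUMED log layout (hypothetical, adapt to your own logs):
#   <ISO timestamp> <METHOD> <path[?query]> <status>
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<method>[A-Z]+)\s+(?P<path>\S+)")


def tally_requests(lines):
    """Count requests per (second, path with the query string removed)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        second = match.group("ts")[:19]  # keep YYYY-MM-DDTHH:MM:SS only
        path = match.group("path").split("?", 1)[0]
        counts[(second, path)] += 1
    return counts


def sync_share(counts, prefix="/admin/log"):
    """Fraction of all counted requests whose path starts with prefix."""
    total = sum(counts.values())
    sync = sum(n for (_, path), n in counts.items() if path.startswith(prefix))
    return sync / total if total else 0.0


if __name__ == "__main__":
    # Made-up sample lines, only to show the call pattern.
    sample = [
        "2021-10-22T10:46:01 GET /admin/log?x=1 200",
        "2021-10-22T10:46:01 GET /admin/log?x=2 200",
        "2021-10-22T10:46:01 PUT /bucket/obj 200",
    ]
    counts = tally_requests(sample)
    for (second, path), n in sorted(counts.items()):
        print(second, path, n)
    print("sync share:", sync_share(counts))
```

Grouping by second and by path without the query string makes it easy to see whether the "/admin/log" rate stays flat regardless of client activity or follows the write rate on the master.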

When we stop and restart the RGW instances on cluster B, they first catch up with the delta, and as soon as that finishes, they start requesting "/admin/log…" in this endless loop again.

Is this number of internal, sync-related requests normal and expected?

Thanks for any ideas on how to debug or introspect this.

Best
Stefan

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx