Re: radosgw multisite sync /admin/log requests overloading system.

Wyll Ingersoll <wyllys.ingersoll@xxxxxxxxxxxxxx> · Fri, 3 Jun 2022 19:10:20 +0000

Put another way: is there a way to throttle the metadata sync requests in a multisite cluster? They seem to be overwhelming the master zone rgw server. It constantly runs at 40%+ CPU, and the logs show a steady stream of /admin/log?type=metadata requests from the other zones. Is this normal behavior?

________________________________
From: Wyll Ingersoll <wyllys.ingersoll@xxxxxxxxxxxxxx>
Sent: Wednesday, June 1, 2022 11:57 AM
To: dev@xxxxxxx <dev@xxxxxxx>
Subject: radosgw multisite sync /admin/log requests overloading system.

I have a simple multisite radosgw configuration set up for testing. There is 1 realm, 1 zonegroup, and 2 separate clusters, each with its own zone. There is 1 bucket with 1 object in it, and no updates are currently happening. No group sync policy is currently defined.

The problem I see is that the radosgw on the secondary zone is flooding the master zone with requests for /admin/log. The radosgw on the secondary is consuming roughly 50% of the CPU cycles. The master zone radosgw is equally active and is flooding the logs (at 1/5 level) with entries like this:

2022-06-01T11:45:06.719-0400 7ff415f8b700  1 ====== req done req=0x7ff5e02ed680 op status=0 http_status=200 latency=0.004000040s ======
2022-06-01T11:45:06.719-0400 7ff415f8b700  1 beast: 0x7ff5e02ed680: 10.15.1.40 - syncuser [01/Jun/2022:11:45:06.715 -0400] "GET /admin/log?type=metadata&id=4&period=92e4fbd8-3429-4cc6-a9f4-6f756ba0c592&max-entries=100&&rgwx-zonegroup=3bc6efd6-a780-4cd1-9685-376e8b477756 HTTP/1.1" 200 44 - - - latency=0.004000040s
2022-06-01T11:45:06.719-0400 7ff446fed700  1 ====== req done req=0x7ff5e0572680 op status=0 http_status=200 latency=0.004000040s ======
2022-06-01T11:45:06.719-0400 7ff446fed700  1 beast: 0x7ff5e0572680: 10.15.1.40 - syncuser [01/Jun/2022:11:45:06.715 -0400] "GET /admin/log?type=metadata&id=5&period=92e4fbd8-3429-4cc6-a9f4-6f756ba0c592&max-entries=100&&rgwx-zonegroup=3bc6efd6-a780-4cd1-9685-376e8b477756 HTTP/1.1" 200 44 - - - latency=0.004000040s
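
To put a number on the flood, here is a rough sketch that tallies /admin/log requests per log type and shard id from beast access-log lines like the ones above. The function name count_admin_log_requests and the regular expression are my own illustration, not part of Ceph, and the pattern assumes the log format shown in this message.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumed pattern for the quoted request in a beast access-log line,
# e.g. "GET /admin/log?type=metadata&id=4 HTTP/1.1"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+"')


def count_admin_log_requests(lines):
    """Count /admin/log requests, keyed by (log type, shard id).

    Hypothetical helper for illustration; not a Ceph API.
    """
    counts = Counter()
    for line in lines:
        # Only the "beast:" lines carry the request URL; skip "req done" lines.
        if " beast: " not in line:
            continue
        match = REQUEST_RE.search(line)
        if match is None:
            continue
        url = urlsplit(match.group("target"))
        if url.path != "/admin/log":
            continue
        # parse_qs ignores the empty field produced by the "&&" in the URL.
        query = parse_qs(url.query)
        log_type = query.get("type", ["?"])[0]
        shard_id = query.get("id", ["?"])[0]
        counts[(log_type, shard_id)] += 1
    return counts
```

Run over the four lines above, it counts one request each for metadata shards 4 and 5. Over a longer window, the per-shard totals divided by the elapsed time give a request rate to compare between zones.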

What is going on and how do I fix this?  The period on both zones is current and at the same epoch value.
Any ideas/suggestions?

thanks,
   Wyllys Ingersoll
