Hey Ben,
Could you include the following?
Thanks,

From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Benjamin.Zieglmeier <Benjamin.Zieglmeier@xxxxxxxxxx>
Sent: Tuesday, February 26, 2019 9:33 AM
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re: Multi-Site Cluster RGW Sync issues

Hello,
We have a Luminous 12.2.5 cluster configured as a two-zone multisite. The cluster has been running for about a year and holds only ~140G of data (~350k objects). We recently added a third zone to the zonegroup to facilitate a migration out of an existing site. Sync appears to be working, and both `radosgw-admin sync status` and `radosgw-admin sync status --rgw-zone=<new zone name>` reflect this. The problem is that once data replication completes, the radosgw process on one of the rgws serving the new zone consumes all the CPU, and the rgw log is flooded with “ERROR: failed to read mdlog info with (2) No such file or directory” at a rate of about 1000 log entries/sec.
This has been happening for days on end now, and we are concerned about what is going on between these two zones. Logs are constantly filling up on the rgws, and we are out of ideas. Are they trying to catch up on metadata? After extensive searching and racking our brains, we still cannot figure out what is causing all these requests (and errors) between the two zones.
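To put a number on the log flood described above, here is a minimal sketch that counts the mdlog error per second. It assumes each rgw log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp (the first 19 characters); the function name `errors_per_second` is hypothetical, and the slice would need adjusting if the log format differs.

```python
from collections import Counter

# Error text quoted from the rgw log; only the stable prefix is matched.
ERROR_TEXT = "ERROR: failed to read mdlog info"


def errors_per_second(lines, needle=ERROR_TEXT):
    """Count matching log lines per one-second bucket.

    Assumption: every line begins with 'YYYY-MM-DD HH:MM:SS', so the
    first 19 characters identify the second the entry was written.
    """
    counts = Counter()
    for line in lines:
        if needle in line:
            counts[line[:19]] += 1
    return counts
```

Passing an open log file (any iterable of lines) returns a `Counter` keyed by second, which makes it easy to see whether the rate is steady or comes in bursts.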
Thanks,
Ben
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com