The output has 57000 lines (and growing). I’ve uploaded the output to:
https://gist.github.com/zieg8301/7e6952e9964c1e0964fb63f61e7b7be7

Thanks,
Ben

From: Matthew H <matthew.heler@xxxxxxxxxxx>

Hey Ben,

Could you include the following?

From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Benjamin.Zieglmeier <Benjamin.Zieglmeier@xxxxxxxxxx>

Hello,

We have a two-zone multisite Luminous 12.2.5 cluster. The cluster has been running for about one year and holds only ~140G of data (~350k objects). We recently added a third zone to the zonegroup to facilitate a migration out of an existing site. Sync appears to be working, and running `radosgw-admin sync status` and `radosgw-admin sync status --rgw-zone=<new zone name>` reflects the same.

The problem we are having is that once the data replication completes, one of the rgws serving the new zone has the radosgw process consuming all the CPU, and the rgw log is flooded with “ERROR: failed to read mdlog info with (2) No such file or directory” at a rate of about 1000 log entries/sec.

This has been happening for days on end now, and we are concerned about what is going on between these two zones. Logs are constantly filling up on the rgws and we are out of ideas. Are they trying to catch up on metadata? After extensive searching and racking our brains, we are unable to figure out what is causing all these requests (and errors) between the two zones.
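
For anyone wanting to put a number on the log flood described above, here is a minimal sketch in Python that counts how often the quoted error appears per second in an rgw log. It assumes each log line begins with a `YYYY-MM-DD HH:MM:SS` timestamp (any fractional seconds are ignored); the function names and the log path are hypothetical and not part of any Ceph tooling.

```python
import re
import sys
from collections import Counter

# Assumption: log lines start with "YYYY-MM-DD HH:MM:SS" (fractional seconds may follow).
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")

# Substring of the error message quoted in the report.
ERROR_TEXT = "failed to read mdlog info"


def errors_per_second(lines):
    """Count matching error lines, grouped by the whole second they were logged in."""
    counts = Counter()
    for line in lines:
        if ERROR_TEXT not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:  # lines without a recognizable timestamp are skipped
            counts[match.group(1)] += 1
    return counts


def summarize(counts):
    """Return totals, the peak rate, and the mean over seconds that had at least one error."""
    if not counts:
        return {"seconds": 0, "total": 0, "peak": 0, "mean": 0.0}
    total = sum(counts.values())
    return {
        "seconds": len(counts),
        "total": total,
        "peak": max(counts.values()),
        "mean": total / len(counts),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python mdlog_rate.py path/to/rgw.log  (script name and path are placeholders)
    with open(sys.argv[1], errors="replace") as log_file:
        print(summarize(errors_per_second(log_file)))
```

Note that the mean is taken only over seconds in which the error occurred, so quiet periods do not lower it. Comparing the result across the rgws in the new zone may help show whether the flood is confined to a single gateway.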

Thanks,
Ben

_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com