Hi guys,

My Ceph release is Quincy 17.2.5. I need to change the master zone so that I can decommission the old one and upgrade all zones. I have separated client traffic and sync traffic in the RGWs, meaning separate RGW daemons handle the sync process.

I ran into an issue when trying to sync one of the zones in the zonegroup. Data sync is proceeding fine, but metadata sync stays stuck behind on one shard. Here is the output from radosgw-admin sync status:

  metadata sync syncing
                full sync: 1/64 shards
                full sync: 135 entries to sync
                incremental sync: 63/64 shards
                metadata is behind on 1 shard
                behind shards: [0]

In the RGW log, I see this error:

2024-12-15T21:30:59.641+0000 7f6dff472700 1 beast: 0x7f6d2f1cf730: 172.19.66.112 - s3-cdn-user [15/Dec/2024:21:30:59.641 +0000] "GET /admin/log/?type=data&id=56&marker=00000000000000000000%3A00000000000000204086&extra-info=true&rgwx-zonegroup=7c01d60f-88c6-4192-baf7-d725260bf05d HTTP/1.1" 200 44 - - - latency=0.000000000s
2024-12-15T21:30:59.701+0000 7f6e44d1e700 0 meta sync: ERROR: full_sync(): RGWRadosGetOmapKeysCR() returned ret=-2
2024-12-15T21:30:59.701+0000 7f6e44d1e700 0 RGW-SYNC:meta:shard[0]: ERROR: failed to list omap keys, status=-2
2024-12-15T21:30:59.701+0000 7f6e44d1e700 0 meta sync: ERROR: RGWBackoffControlCR called coroutine returned -2
2024-12-15T21:31:00.705+0000 7f6e44d1e700 0 meta sync: ERROR: full_sync(): RGWRadosGetOmapKeysCR() returned ret=-2
2024-12-15T21:31:00.705+0000 7f6e44d1e700 0 RGW-SYNC:meta:shard[0]: ERROR: failed to list omap keys, status=-2
2024-12-15T21:31:00.705+0000 7f6e44d1e700 0 meta sync: ERROR: RGWBackoffControlCR called coroutine returned -2

I've tried the following steps:

- Changed the PG number of the metadata pool to force a rebalance, but everything was fine.
- Ran metadata sync init and then tried running the sync again.
- Restarted the RGW services in both this zone and the master zone.
- Created a user in the master zone to ensure metadata sync works, which was successful.
- Checked the OSD logs but didn't see any specific errors.
- Attempted to list metadata in the pool using rados ls -p s3-cdn-dc07.rgw.meta, but got an empty result.
- Compared the code for listing OMAP keys between the Quincy and Squid versions; there were no specific changes.

I'm looking for any advice or suggestions to resolve this issue.
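
For anyone reproducing the checks above, here is a minimal sketch of how the two outputs could be summarized. The helper names meta_behind_shards and sync_error_codes are hypothetical and not part of Ceph; they only parse text in the format quoted in this post. Two background facts apply: errno 2 on Linux is ENOENT ("no such file or directory"), so ret=-2 suggests an object could not be found, and a plain rados ls lists only the default namespace, while the RGW .rgw.meta pool keeps its objects in named namespaces. Whether either fact explains the stuck shard is an assumption, not a confirmed diagnosis.

```shell
#!/usr/bin/env bash
# Sketch only: both helper names are hypothetical, not Ceph tooling.

# Read `radosgw-admin sync status` output on stdin and print the shard
# IDs listed as behind in the "metadata sync" section, one per line.
# The data sync section is skipped so its shards are not mixed in.
meta_behind_shards() {
    awk '
        /metadata sync/    { in_meta = 1 }
        /data sync source/ { in_meta = 0 }
        in_meta && /behind shards:/ {
            gsub(/.*\[|\].*/, "")          # keep only the text inside [ ]
            n = split($0, ids, ",")
            for (i = 1; i <= n; i++) {
                gsub(/ /, "", ids[i])
                if (ids[i] != "") print ids[i]
            }
        }'
}

# Read RGW log lines on stdin and print the distinct ret=/status= codes.
sync_error_codes() {
    grep -oE '(ret|status)=-?[0-9]+' | sort -u
}

# Example use on a live cluster (not executed here):
#   radosgw-admin sync status | meta_behind_shards
#   sync_error_codes < /path/to/rgw.log      # log path is a placeholder
#   rados -p s3-cdn-dc07.rgw.meta ls --all   # --all lists every namespace
```

The last commented command differs from the one in the list above only by --all, which asks rados to list objects across all namespaces instead of the default one.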