I also see this log in the RGW log (a few command sketches for digging into this are at the end of this message):

2024-12-16T12:23:58.651+0000 7f9b2b9fe700 1 ====== starting new request req=0x7f9ad9959730 =====
2024-12-16T12:23:58.651+0000 7f9b2b9fe700 -1 req 11778501317150336521 0.000000000s :list_data_changes_log int rgw::cls::fifo::{anonymous}::list_part(const DoutPrefixProvider*, librados::v14_2_0::IoCtx&, const string&, std::optional<std::basic_string_view<char> >, uint64_t, uint64_t, std::vector<rados::cls::fifo::part_list_entry>*, bool*, bool*, std::string*, uint64_t, optional_yield):245 fifo::op::LIST_PART failed r=-34 tid=4176
2024-12-16T12:23:58.651+0000 7f9b2b9fe700 -1 req 11778501317150336521 0.000000000s :list_data_changes_log int rgw::cls::fifo::FIFO::list(const DoutPrefixProvider*, int, std::optional<std::basic_string_view<char> >, std::vector<rgw::cls::fifo::list_entry>*, bool*, optional_yield):1660 list_entries failed: r=-34 tid=4176
2024-12-16T12:23:58.651+0000 7f9b2b9fe700 -1 req 11778501317150336521 0.000000000s :list_data_changes_log virtual int RGWDataChangesFIFO::list(const DoutPrefixProvider*, int, int, std::vector<rgw_data_change_log_entry>&, std::optional<std::basic_string_view<char> >, std::string*, bool*): unable to list FIFO: data_log.44: (34) Numerical result out of range

On Sun, Dec 15, 2024 at 10:45 PM Vahideh Alinouri <vahideh.alinouri@xxxxxxxxx> wrote:

> Hi guys,
>
> My Ceph release is Quincy 17.2.5. I need to change the master zone to
> decommission the old one and upgrade all zones. I have separated the
> client traffic and sync traffic in the RGWs, meaning there are separate
> RGW daemons handling the sync process.
>
> I encountered an issue when trying to sync one of the zones in the
> zonegroup. The data sync is proceeding fine, but I have an issue with the
> metadata sync: it gets stuck behind on one shard. Here is the output from
> radosgw-admin sync status:
>
>   metadata sync syncing
>                 full sync: 1/64 shards
>                 full sync: 135 entries to sync
>                 incremental sync: 63/64 shards
>                 metadata is behind on 1 shard
>                 behind shards: [0]
>
> In the RGW log, I see this error:
>
> 2024-12-15T21:30:59.641+0000 7f6dff472700 1 beast: 0x7f6d2f1cf730: 172.19.66.112 - s3-cdn-user [15/Dec/2024:21:30:59.641 +0000] "GET /admin/log/?type=data&id=56&marker=00000000000000000000%3A00000000000000204086&extra-info=true&rgwx-zonegroup=7c01d60f-88c6-4192-baf7-d725260bf05d HTTP/1.1" 200 44 - - - latency=0.000000000s
> 2024-12-15T21:30:59.701+0000 7f6e44d1e700 0 meta sync: ERROR: full_sync(): RGWRadosGetOmapKeysCR() returned ret=-2
> 2024-12-15T21:30:59.701+0000 7f6e44d1e700 0 RGW-SYNC:meta:shard[0]: ERROR: failed to list omap keys, status=-2
> 2024-12-15T21:30:59.701+0000 7f6e44d1e700 0 meta sync: ERROR: RGWBackoffControlCR called coroutine returned -2
> 2024-12-15T21:31:00.705+0000 7f6e44d1e700 0 meta sync: ERROR: full_sync(): RGWRadosGetOmapKeysCR() returned ret=-2
> 2024-12-15T21:31:00.705+0000 7f6e44d1e700 0 RGW-SYNC:meta:shard[0]: ERROR: failed to list omap keys, status=-2
> 2024-12-15T21:31:00.705+0000 7f6e44d1e700 0 meta sync: ERROR: RGWBackoffControlCR called coroutine returned -2
>
> I’ve tried the following steps:
>
> - Changed the PG number of the metadata pool to force a rebalance, but
>   everything was fine.
> - Ran metadata sync init and tried to run the sync again.
> - Restarted the RGW services in both the zone and the master zone.
> - Created a user in the master zone to confirm that metadata sync works,
>   which was successful.
> - Checked the OSD logs but didn’t see any specific errors.
> - Attempted to list the metadata in the pool using rados ls -p
>   s3-cdn-dc07.rgw.meta, but got an empty result.
> - Compared the code for listing OMAP keys between the Quincy and Squid
>   versions; there were no specific changes.
>
> I’m looking for any advice or suggestions to resolve this issue.
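
A few checks that might help narrow this down. All of the commands below are rough, untested sketches, and the pool names (s3-cdn-dc07.rgw.log, s3-cdn-dc07.rgw.meta) are assumptions based on the zone name above, so please adjust them to your setup.

For the ERANGE (-34) errors on data_log.44: the log shows the data changes log for that shard is FIFO-backed, so a first step could be to look at the datalog state and at the RADOS objects backing that shard (the exact naming of the FIFO part objects may differ on your cluster):

    # overall data log state as RGW sees it
    radosgw-admin datalog status

    # list the RADOS objects that back shard 44 of the data log
    rados -p s3-cdn-dc07.rgw.log ls | grep '^data_log\.44'

    # inspect one of them (size / mtime)
    rados -p s3-cdn-dc07.rgw.log stat <object-name-from-the-listing>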
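
On the metadata full sync being stuck on shard 0: -2 is ENOENT, and RGWRadosGetOmapKeysCR() failing with it inside full_sync() suggests the object holding the full-sync index for that shard cannot be found. As far as I remember those index objects live in the zone's log pool and are named something like meta.full-sync.index.<shard>, but treat both the pool and the object name below as assumptions to verify:

    # current metadata sync state on the syncing zone
    radosgw-admin metadata sync status

    # look for the full-sync index objects in the log pool
    rados -p s3-cdn-dc07.rgw.log ls | grep -i 'full-sync'

    # if the shard 0 index object exists, check whether it has omap keys
    rados -p s3-cdn-dc07.rgw.log listomapkeys meta.full-sync.index.0

If the object turns out to be missing, it might be worth watching whether a fresh metadata sync init recreates it and whether it disappears again afterwards.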
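
On the empty rados ls -p s3-cdn-dc07.rgw.meta result: that by itself is not necessarily a problem. RGW stores the objects in the .rgw.meta pool under RADOS namespaces (root, users.uid, users.keys, ...), and a plain rados ls only shows the default namespace, which can legitimately be empty. Listing across namespaces should show whether the metadata objects are actually there, for example:

    # list objects in all namespaces (output is prefixed with the namespace)
    rados -p s3-cdn-dc07.rgw.meta ls --all

    # or look at a single namespace, e.g. the user index
    rados -p s3-cdn-dc07.rgw.meta -N users.uid ls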