Hi,

I have trouble with large omap objects in the RGW index pool of a cluster. Some background information: there is CephFS and RBD usage on the main cluster, but for this issue I think only S3 is relevant. There is one realm and one zonegroup with two zones, set up for bidirectional sync. Since multisite does not allow for dynamic resharding, we have to reshard by hand in this cluster – looking forward to Reef!

From the logs:

cluster 2023-07-17T22:59:03.018722+0000 osd.75 (osd.75) 623978 : cluster [WRN] Large omap object found. Object: 34:bcec3016:::.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.5:head PG: 34.680c373d (34.5) Key count: 962091 Size (bytes): 277963182

The offending bucket looks like this:

# radosgw-admin bucket stats \
  | jq -r '.[] | select(.marker == "3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9") | "\(.num_shards) \(.usage["rgw.main"].num_objects)"'
131 9463833

Last week the number of objects was about 12 million, which is why I resharded the offending bucket twice, I think: once to 129 and a second time to 131, because I wanted some leeway (or lieway? scnr, Sage). Unfortunately, even after a week the omap objects were still too big (the log line above is quite recent), so I looked into it again.

# rados -p raum.rgw.buckets.index ls \
  | grep .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9 \
  | sort -V
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.0
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.1
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.2
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.3
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.4
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.5
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.6
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.7
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.8
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.9
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.10

# rados -p raum.rgw.buckets.index ls \
  | grep .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9 \
  | sort -V \
  | xargs -IOMAP sh -c \
    'rados -p raum.rgw.buckets.index listomapkeys OMAP | wc -l'
1013854
1011007
1012287
1011232
1013565
998262
1012777
1012713
1012230
1010690
997111

Apparently, only 11 shards are in use. That would explain why the "Key count" from the log line is about ten times higher than I would expect: 9,463,833 objects spread over 131 shards should come to roughly 72,000 keys per shard, yet each of the 11 shards holds around a million.

How can I deal with this issue? One thing I could try would be to reshard to a lower number, but I am not sure whether there are any risks associated with "downsharding". After that I could reshard to something like 97. Or I could "downshard" directly to 97.

The second zone has a similar problem, but as the error message tells me, resharding there directly would be a bad idea. Will it just take more time until the new sharding is transferred to the second zone?

Best,
Christian Kugler
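
For reference, a sketch of the manual reshard step described above (<bucket-name> is a placeholder here, since the actual bucket name does not appear in the post):

# radosgw-admin bucket reshard \
    --bucket=<bucket-name> \
    --num-shards=131

Afterwards the shard count can be checked via bucket stats, and any still-pending reshard operations via the reshard queue:

# radosgw-admin bucket stats --bucket=<bucket-name> | jq .num_shards
# radosgw-admin reshard list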
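
As for the second zone, a sketch of how one might inspect overall and per-bucket sync progress before deciding whether the new shard layout has been picked up there (again with <bucket-name> as a placeholder):

# radosgw-admin sync status
# radosgw-admin bucket sync status --bucket=<bucket-name>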