1. I recommend that you *not* issue another bucket reshard until you figure out what's going on.

2. Which version of Ceph are you using?

3. Can you issue a `radosgw-admin metadata get bucket:<bucket-name>` so we can verify what the current marker is?

4. After you resharded previously, did you get command-line output along the lines of:

    2023-07-24T13:33:50.867-0400 7f10359f2a80 1 execute INFO: reshard of bucket "<bucket-name>" completed successfully

Eric (he/him)

P.S. It's likely obvious, but in the above replace <bucket-name> with the actual bucket name.
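P.P.S. If it helps, here is a rough sketch of how I'd pull the marker and the current bucket instance id out of that metadata, and how to check whether RGW still records a reshard as pending. The jq paths assume the usual `metadata get` JSON layout (marker and bucket_id under .data.bucket); adjust them if your output is shaped differently, and <bucket-id-from-above> is just a placeholder. If the earlier reshard really completed, I'd expect bucket_id to differ from the marker and the live index objects to be named .dir.<bucket-id>.* rather than .dir.<marker>.*:

    # show the marker and the current bucket instance id
    radosgw-admin metadata get bucket:<bucket-name> \
        | jq '.data.bucket.marker, .data.bucket.bucket_id'

    # count the index shard objects that belong to the current instance id
    rados -p raum.rgw.buckets.index ls \
        | grep "<bucket-id-from-above>" \
        | wc -l

    # check whether RGW still considers a reshard to be in progress on any shard
    radosgw-admin reshard status --bucket=<bucket-name>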
> On Jul 18, 2023, at 10:18 AM, Christian Kugler <syphdias+ceph@xxxxxxxxx> wrote:
>
> Hi,
>
> I have trouble with large OMAP objects in a cluster in the RGW index pool. Some
> background information about the cluster: There is CephFS and RBD usage on the
> main cluster, but for this issue I think only S3 is interesting.
> There is one realm and one zonegroup with two zones which have a bidirectional
> sync set up. Since this does not allow for autoresharding, we have to do it by
> hand in this cluster – looking forward to Reef!
>
> From the logs:
> cluster 2023-07-17T22:59:03.018722+0000 osd.75 (osd.75) 623978 :
> cluster [WRN] Large omap object found. Object:
> 34:bcec3016:::.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.5:head
> PG: 34.680c373d (34.5) Key count: 962091 Size (bytes): 277963182
>
> The offending bucket looks like this:
> # radosgw-admin bucket stats \
>     | jq '.[] | select(.marker == "3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9")
>       | "\(.num_shards) \(.usage["rgw.main"].num_objects)"' -r
> 131 9463833
>
> Last week the number of objects was about 12 million, which is why I resharded
> the offending bucket twice, I think. Once to 129 and the second time to 131
> because I wanted some leeway (or lieway? scnr, Sage).
>
> Unfortunately, even after a week the index objects were still too big (the log
> line above is quite recent), so I looked into it again.
>
> # rados -p raum.rgw.buckets.index ls \
>     | grep .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9 \
>     | sort -V
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.0
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.1
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.2
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.3
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.4
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.5
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.6
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.7
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.8
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.9
> .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.10
> # rados -p raum.rgw.buckets.index ls \
>     | grep .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9 \
>     | sort -V \
>     | xargs -IOMAP sh -c \
>       'rados -p raum.rgw.buckets.index listomapkeys OMAP | wc -l'
> 1013854
> 1011007
> 1012287
> 1011232
> 1013565
> 998262
> 1012777
> 1012713
> 1012230
> 1010690
> 997111
>
> Apparently, only 11 shards are in use. This would explain why the "Key count"
> (from the log line) is about ten times higher than I would expect.
>
> How can I deal with this issue?
> One thing I could try to fix this would be to reshard to a lower number, but I
> am not sure if there are any risks associated with "downsharding". After that I
> could reshard to something like 97. Or I could directly "downshard" to 97.
>
> Also, the second zone has a similar problem, but as the error message lets me
> know, this would be a bad idea. Will it just take more time until the sharding
> is transferred to the second zone?
>
> Best,
> Christian Kugler
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx