On Wed, 2022-11-23 at 12:57 -0500, Casey Bodley wrote:
> hi Jan,
>
> On Wed, Nov 23, 2022 at 12:45 PM Jan Horstmann <J.Horstmann@xxxxxxxxxxx> wrote:
> >
> > Hi list,
> > I am completely lost trying to reshard a radosgw bucket, which fails
> > with the error:
> >
> > process_single_logshard: Error during resharding bucket
> > 68ddc61c613a4e3096ca8c349ee37f56/snapshotnfs:(2) No such file or directory
> >
> > But let me start from the beginning. We are running a Ceph cluster,
> > version 15.2.17. Recently we received a health warning because of
> > "large omap objects", so I grepped through the logs to get more
> > information about the object and then mapped that to a radosgw bucket
> > instance ([1]).
> > I believe this should normally be handled by dynamic resharding of the
> > bucket, which has already happened for this bucket, bringing it to 23
> > shards ([2]). For recent resharding attempts the radosgw has been
> > logging the error mentioned at the beginning. I tried to reshard
> > manually by following the process in [3], but that consistently leads
> > to the same error.
> > When running the reshard with debug options (--debug-rgw=20
> > --debug-ms=1) I can get some additional insight into where exactly the
> > failure occurs:
> >
> > 2022-11-23T10:41:20.754+0000 7f58cf9d2080 1 -- 10.38.128.3:0/1221656497 --> [v2:10.38.128.6:6880/44286,v1:10.38.128.6:6881/44286] -- osd_op(unknown.0.0:46 5.6 5:66924383:reshard::reshard.0000000005:head [call rgw.reshard_get in=149b] snapc 0=[] ondisk+read+known_if_redirected e44374) v8 -- 0x56092dd46a10 con 0x56092dcfd7a0
> > 2022-11-23T10:41:20.754+0000 7f58bb889700 1 -- 10.38.128.3:0/1221656497 <== osd.210 v2:10.38.128.6:6880/44286 4 ==== osd_op_reply(46 reshard.0000000005 [call] v0'0 uv1180019 ondisk = -2 ((2) No such file or directory)) v8 ==== 162+0+0 (crc 0 0 0) 0x7f58b00dc020 con 0x56092dcfd7a0
> >
> > I am not sure how to interpret this or how to debug it any further.
> > Of course I can provide the full output if that helps.
> >
> > Thanks and regards,
> > Jan
> >
> > [1]
> > root@ceph-mon1:~# grep -r 'Large omap object found. Object' /var/log/ceph/ceph.log
> > 2022-11-15T14:47:28.900679+0000 osd.47 (osd.47) 10890 : cluster [WRN] Large omap object found. Object: 3:9660022b:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.9:head PG: 3.d4400669 (3.29) Key count: 336457 Size (bytes): 117560231
> > 2022-11-17T04:51:43.593811+0000 osd.50 (osd.50) 90 : cluster [WRN] Large omap object found. Object: 3:0de49b75:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.10:head PG: 3.aed927b0 (3.30) Key count: 205346 Size (bytes): 71669614
> > 2022-11-18T02:55:07.182419+0000 osd.47 (osd.47) 10917 : cluster [WRN] Large omap object found. Object: 3:9660022b:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.9:head PG: 3.d4400669 (3.29) Key count: 449776 Size (bytes): 157310435
> > 2022-11-19T09:56:47.630679+0000 osd.29 (osd.29) 114 : cluster [WRN] Large omap object found. Object: 3:61ad76c5:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.12:head PG: 3.a36eb586 (3.6) Key count: 213843 Size (bytes): 74703544
> > 2022-11-20T13:04:39.979349+0000 osd.72 (osd.72) 83 : cluster [WRN] Large omap object found. Object: 3:2b3227e7:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.22:head PG: 3.e7e44cd4 (3.14) Key count: 326676 Size (bytes): 114453145
> > 2022-11-21T02:53:32.410698+0000 osd.50 (osd.50) 151 : cluster [WRN] Large omap object found.
> > Object: 3:0de49b75:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.10:head PG: 3.aed927b0 (3.30) Key count: 216764 Size (bytes): 75674839
> > 2022-11-22T18:04:09.757825+0000 osd.47 (osd.47) 10964 : cluster [WRN] Large omap object found. Object: 3:9660022b:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.9:head PG: 3.d4400669 (3.29) Key count: 449776 Size (bytes): 157310435
> > 2022-11-23T00:44:55.316254+0000 osd.29 (osd.29) 163 : cluster [WRN] Large omap object found. Object: 3:61ad76c5:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.12:head PG: 3.a36eb586 (3.6) Key count: 213843 Size (bytes): 74703544
> > 2022-11-23T09:10:07.842425+0000 osd.55 (osd.55) 13968 : cluster [WRN] Large omap object found. Object: 3:3fa378c9:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.20:head PG: 3.931ec5fc (3.3c) Key count: 219204 Size (bytes): 76509687
> > 2022-11-23T09:11:15.516973+0000 osd.72 (osd.72) 112 : cluster [WRN] Large omap object found. Object: 3:2b3227e7:::.dir.ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19.22:head PG: 3.e7e44cd4 (3.14) Key count: 326676 Size (bytes): 114453145
> > root@ceph-mon1:~# radosgw-admin metadata list "bucket.instance" | grep ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19
> >     "68ddc61c613a4e3096ca8c349ee37f56/snapshotnfs:ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19",
> >
> > [2]
> > root@ceph-mon1:~# radosgw-admin bucket stats --bucket 68ddc61c613a4e3096ca8c349ee37f56/snapshotnfs
> > {
> >     "bucket": "snapshotnfs",
> >     "num_shards": 23,
> >     "tenant": "68ddc61c613a4e3096ca8c349ee37f56",
> >     "zonegroup": "bf22bf53-c135-450b-946f-97e16d1bc326",
> >     "placement_rule": "default-placement",
> >     "explicit_placement": {
> >         "data_pool": "",
> >         "data_extra_pool": "",
> >         "index_pool": ""
> >     },
> >     "id": "ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63073818.19",
> >     "marker": "ee3fa6a3-4af3-4ac2-86c2-d2c374080b54.63090893.15",
> >     "index_type": "Normal",
> >     "owner": "68ddc61c613a4e3096ca8c349ee37f56$68ddc61c613a4e3096ca8c349ee37f56",
> >     "ver": "0#205,1#32,2#78,3#41,4#25,5#23,6#30,7#94732,8#24,9#190897,10#93417,11#128,12#91536,13#23,14#407,15#137262,16#24,17#32,18#104,19#63,20#94213,21#24,22#140543",
> >     "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0,11#0,12#0,13#0,14#0,15#0,16#0,17#0,18#0,19#0,20#0,21#0,22#0",
> >     "mtime": "2022-11-14T07:55:28.287021Z",
> >     "creation_time": "2022-11-07T07:08:58.874542Z",
> >     "max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#,11#,12#,13#,14#,15#,16#,17#,18#,19#,20#,21#,22#",
> >     "usage": {
> >         "rgw.main": {
> >             "size": 36246282736024,
> >             "size_actual": 36246329626624,
> >             "size_utilized": 36246282736024,
> >             "size_kb": 35396760485,
> >             "size_kb_actual": 35396806276,
> >             "size_kb_utilized": 35396760485,
> >             "num_objects": 1837484
> >         },
> >         "rgw.multimeta": {
> >             "size": 0,
> >             "size_actual": 0,
> >             "size_utilized": 24570,
> >             "size_kb": 0,
> >             "size_kb_actual": 0,
> >             "size_kb_utilized": 24,
> >             "num_objects": 910
> >         }
> >     },
> >     "bucket_quota": {
> >         "enabled": false,
> >         "check_on_raw": true,
> >         "max_size": -1,
> >         "max_size_kb": 0,
> >         "max_objects": -1
> >     }
> > }
> >
> > [3]
> > https://docs.ceph.com/en/octopus/radosgw/dynamicresharding/
> >
> > root@ceph-mon1:~# radosgw-admin reshard add --bucket 68ddc61c613a4e3096ca8c349ee37f56/snapshotnfs --num-shards 29
> > root@ceph-mon1:~# radosgw-admin reshard list
> > [
> >     {
> >         "time": "2022-11-23T10:38:25.690183Z",
> >         "tenant": "",
"bucket_name": "68ddc61c613a4e3096ca8c349ee37f56/snapshotnfs", > > it doesn't look like the 'reshard add' command understands this > "tenant/bucket" format you provided. you might try specifying the > --tenant separately > Thank you, that did the trick. After processing the reshard and deep scrubbing the affected pgs health went back to okay. Now I am left to wonder why there was no dynamic resharding though. > > "bucket_id": "ee3fa6a3-4af3-4ac2-86c2- > > d2c374080b54.63073818.19", > > "new_instance_id": "", > > "old_num_shards": 23, > > "new_num_shards": 29 > > } > > ] > > root@ceph-mon1:~# radosgw-admin reshard process > > 2022-11-23T10:41:20.758+0000 7f58cf9d2080 0 process_single_logshard: > > Error during resharding bucket > > 68ddc61c613a4e3096ca8c349ee37f56/snapshotnfs:(2) No such file or > > directory > > > > > > > > > > -- > > Jan Horstmann > > Systementwickler | Infrastruktur > > _____ > > > > > > Mittwald CM Service GmbH & Co. KG > > Königsberger Straße 4-6 > > 32339 Espelkamp > > > > Tel.: 05772 / 293-900 > > Fax: 05772 / 293-333 > > > > j.horstmann@xxxxxxxxxxx > > https://www.mittwald.de > > > > Geschäftsführer: Robert Meyer, Florian Jürgens > > > > USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen > > Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad > > Oeynhausen > > > > Informationen zur Datenverarbeitung im Rahmen unserer > > Geschäftstätigkeit > > gemäß Art. 13-14 DSGVO sind unter www.mittwald.de/ds abrufbar. > > _______________________________________________ > > ceph-users mailing list -- ceph-users@xxxxxxx > > To unsubscribe send an email to ceph-users-leave@xxxxxxx > _______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx