For anyone interested or having similar issues, I figured out what was wrong by running

# radosgw-admin --cluster ceph bucket check --bucket=12856/weird_bucket --check-objects > obj.check.out

I reviewed the +1M entries in the file. I wasn't really sure what the output was about, but figured it was probably objects in the bucket that were considered broken somehow. Running

# radosgw-admin --cluster ceph object stat --bucket=weird_bucket --object=$OBJECT

on some of these objects returned "File not found". I then ran

# radosgw-admin --cluster ceph bucket check --bucket=12856/weird_bucket --check-objects --fix

in the hope that it would fix the index and remove the dead object entries, but it didn't. I'm not sure why; --fix might do something else entirely, the --help text just says "besides checking bucket index, will also fix it" :).

I picked out some dead objects, prepended the bucket instance id to them, and had the rados command put a dummy file into the buckets.data pool:

# rados -c /etc/ceph/ceph.conf -p ceph.rgw.buckets.data put be8fa19b-ad79-4cd8-ac7b-1e14fdc882f6.2384280.20_$OBJECT dummy.file

Lo and behold, the rm command was finally able to remove those objects.

Realizing that the number of stale entries in the index was roughly the same as the object count 'bucket stats' reported, I gave up on scripting rados to put dummy files into all the stale entries (a rough, untested sketch of what that loop could have looked like is below) and just went ahead with Red Hat's solution of removing stale buckets (https://access.redhat.com/solutions/2110551)*. Since there were only something like 30 "real" objects in the bucket, having those floating around without a bucket was a lower cost than spending time on scripting and then removing the bucket.
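For reference, the loop I was considering would have been something along these lines. Treat it as an untested sketch: stale_objects.txt is a hypothetical file with one object name per line (extracted however is appropriate from obj.check.out, whose exact format I never fully worked out), and the marker/pool names are just the ones from the commands above.

#!/bin/bash
# Untested sketch: for every stale index entry, put a dummy RADOS object in
# place so that "bucket rm --purge-objects" has something to actually delete.
MARKER="be8fa19b-ad79-4cd8-ac7b-1e14fdc882f6.2384280.20"   # bucket instance id
POOL="ceph.rgw.buckets.data"
touch dummy.file

while IFS= read -r OBJECT; do
    # only bother with entries the gateway considers gone
    if ! radosgw-admin --cluster ceph object stat \
            --bucket=weird_bucket --object="$OBJECT" > /dev/null 2>&1; then
        rados -c /etc/ceph/ceph.conf -p "$POOL" \
            put "${MARKER}_${OBJECT}" dummy.file
    fi
done < stale_objects.txt   # hypothetical: one object name per line from obj.check.out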
I'm not sure how I ended up in a state with this many stale entries. It might have something to do with the fact that the user owning this bucket also had a lot of other bucket indexes that were oversized (+6M objects without index sharding; resharding doesn't work that well, but that's a different thread), in a multi-site environment where the RGWs were crashing every now and then due to memory leak bugs while said oversized indexes were being altered, all at the same time.

* The RH solution article is for Hammer, I'm using Jewel 10.2.7

It was great fun, hope this helps anyone having similar issues.

Cheers!
/andreas

On 8 August 2017 at 12:31, Andreas Calminder
<andreas.calminder@xxxxxxxxxx> wrote:
> Hi,
> I'm running into a weird issue while trying to delete a bucket with
> radosgw-admin
>
> # radosgw-admin --cluster ceph bucket rm --bucket=12856/weird_bucket
> --purge-objects
>
> This returns almost instantly even though the bucket contains +1M
> objects and the bucket isn't removed. Running the above command with
> debug flags (--debug-rgw=20 --debug-ms 20), I notice the session
> closing down after encountering:
>
> 2017-08-08 10:51:52.032946 7f8a9caf4700 10 -- CLIENT_IP:0/482026554 >>
> ENDPOINT_IP:6800/5740 pipe(0x7f8ac2acc8c0 sd=7 :3482 s=2 pgs=7856733
> cs=1 l=1 c=0x7f8ac2acb3a0).reader got message 8 0x7f8a64001640
> osd_op_reply(218
> be8fa19b-ad79-4cd8-ac7b-1e14fdc882f6.2384280.20_a_weird_object
> [getxattrs,stat] v0'0 uv0 ack = -2 ((2) No such file or directory)) v7
> 2017-08-08 10:51:52.032970 7f8a9caf4700 1 -- CLIENT_IP:0/482026554
> <== osd.47 ENDPOINT_IP:6800/5740 8 ==== osd_op_reply(218
> be8fa19b-ad79-4cd8-ac7b-1e14fdc882f6.2384280.20_a_weird_object
> [getxattrs,stat] v0'0 uv0 ack = -2 ((2) No such file or directory)) v7
> ==== 317+0+0 (3298345941 0 0) 0x7f8a64001640 con 0x7f8ac2acb3a0
>
> If I understand the output correctly, the file wasn't found and the
> session was closed down. The radosgw-admin command doesn't hint that
> anything bad has happened though.
>
> Has anyone seen this behaviour or anything similar? Any pointers on
> how to fix it? I just want to get rid of the bucket since it's both
> over-sized and unused.
>
> Best regards,
> Andreas
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com