This generally shouldn't be a problem at your bucket sizes. Have you
checked that the cluster is actually in a healthy state? The sleeping
locks are normal but should be getting woken up; if they aren't, it
means the object access isn't working for some reason. A down PG or
something would be the simplest explanation.
-Greg

On Fri, Aug 28, 2015 at 6:52 PM, Sam Wouters <sam@xxxxxxxxx> wrote:
> Ok, maybe I'm too impatient. It would be great if there were some verbose
> or progress logging of the radosgw-admin tool.
> I will start a check and let it run over the weekend.
>
> tnx,
> Sam
>
> On 28-08-15 18:16, Sam Wouters wrote:
>> Hi,
>>
>> this bucket only has 13389 objects, so the index size shouldn't be a
>> problem. Also, on the same cluster we have another bucket with 1200543
>> objects (but no versioning configured), which has no issues.
>>
>> when we run a radosgw-admin bucket check (--fix), nothing seems to be
>> happening. Putting an strace on the process shows a lot of lines like these:
>> [pid 99372] futex(0x2d730d4, FUTEX_WAIT_PRIVATE, 156619, NULL <unfinished ...>
>> [pid 99385] futex(0x2da9410, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
>> [pid 99371] futex(0x2da9410, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
>> [pid 99385] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
>> [pid 99371] <... futex resumed> ) = 0
>>
>> but no errors in the ceph logs or health warnings.
>>
>> r,
>> Sam
>>
>> On 28-08-15 17:49, Ben Hines wrote:
>>> How many objects in the bucket?
>>>
>>> RGW has problems with index size once the number of objects gets into the
>>> 900000+ level.
>>> The buckets need to be recreated with 'sharded bucket indexes' on:
>>>
>>> rgw override bucket index max shards = 23
>>>
>>> You could also try repairing the index with:
>>>
>>> radosgw-admin bucket check --fix --bucket=<bucketname>
>>>
>>> -Ben
>>>
>>> On Fri, Aug 28, 2015 at 8:38 AM, Sam Wouters <sam@xxxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> we have an rgw bucket (with versioning) where PUT and GET operations for
>>>> specific objects succeed, but retrieving an object list fails.
>>>> Using python-boto just gives us a 500 internal error after a timeout;
>>>> radosgw-admin just hangs.
>>>> Also a radosgw-admin bucket check just seems to hang...
>>>>
>>>> ceph version is 0.94.3 but this also was happening with 0.94.2, we
>>>> quietly hoped upgrading would fix it, but it didn't...
>>>>
>>>> r,
>>>> Sam
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
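
For anyone wanting to check the "sleeping locks should be getting woken up" point against their own trace, here is a minimal sketch that tallies FUTEX_WAIT and FUTEX_WAKE calls per futex address in strace output shaped like the excerpt quoted in this thread. The function name summarize_futex and the regular expression are illustrative assumptions, not part of strace or Ceph, and the pattern only covers the line shape seen above.

```python
import re
from collections import Counter

# Matches the call part of an strace line such as:
#   [pid 99372] futex(0x2d730d4, FUTEX_WAIT_PRIVATE, 156619, NULL <unfinished ...>
# Assumption: only the futex address and the WAIT/WAKE family are of interest.
FUTEX_CALL = re.compile(r"futex\((0x[0-9a-fA-F]+), FUTEX_(WAIT|WAKE)")


def summarize_futex(lines):
    """Count FUTEX_WAIT* and FUTEX_WAKE* calls per futex address."""
    counts = {}
    for line in lines:
        match = FUTEX_CALL.search(line)
        if match is None:
            # "<... futex resumed>" lines carry no address, so they are skipped.
            continue
        addr, op = match.groups()
        counts.setdefault(addr, Counter())[op] += 1
    return counts


# The five strace lines from the thread, with the mail wrapping undone.
sample = [
    "[pid 99372] futex(0x2d730d4, FUTEX_WAIT_PRIVATE, 156619, NULL <unfinished ...>",
    "[pid 99385] futex(0x2da9410, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>",
    "[pid 99371] futex(0x2da9410, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>",
    "[pid 99385] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)",
    "[pid 99371] <... futex resumed> ) = 0",
]

summary = summarize_futex(sample)
for addr, ops in sorted(summary.items()):
    print(addr, "waits:", ops["WAIT"], "wakes:", ops["WAKE"])
```

On a long capture, an address that keeps collecting waits without any wakes would be a hint that the waiters are stuck. A five-line excerpt like the one above is far too short to conclude anything by itself.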