Re: Troubleshooting rgw bucket list

Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> · Tue, 1 Sep 2015 13:37:52 -0700

Yeah, I'm able to reproduce the issue. It is related to the fact that
you have a bunch of delete markers in the bucket, which triggers a
bug in the listing code. I opened a new ceph issue for this one:

http://tracker.ceph.com/issues/12913
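
If you want to look over the index entries for other objects in the
meantime, something like the following might help. It is only a rough
sketch, not part of radosgw-admin: decode_flags and summarize are
hypothetical helper names, the flag bit names are written from memory
of cls_rgw_types.h and should be checked against your release, and the
sample data is trimmed from the bi list output quoted further down.

```python
import json

# Flag bits as recalled from cls_rgw_types.h (assumption, verify per release).
FLAG_NAMES = {
    0x1: "VER",            # a versioned object instance
    0x2: "CURRENT",        # the current instance of a versioned object
    0x4: "DELETE_MARKER",  # delete marker
    0x8: "VER_MARKER",     # placeholder plain entry for a versioned object
}


def decode_flags(flags):
    """Return the names of the bits set in a dirent 'flags' value."""
    return [name for bit, name in sorted(FLAG_NAMES.items()) if flags & bit]


def summarize(bi_entries):
    """Return one (type, name, instance, exists, flag names) tuple per entry."""
    rows = []
    for item in bi_entries:
        entry = item["entry"]
        # olh entries nest name/instance under "key"; others keep them inline.
        key = entry.get("key", entry)
        rows.append((
            item["type"],
            key["name"],
            key["instance"],
            # This release prints "true"/"false" strings; accept booleans too.
            str(entry["exists"]).lower() == "true",
            decode_flags(entry.get("flags", 0)),
        ))
    return rows


# Trimmed sample; in practice load the saved output of
# 'radosgw-admin bi list --bucket=... --object=...' with json.load().
sample = json.loads(r'''
[
  {"type": "plain",
   "entry": {"name": "abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse",
             "instance": "", "exists": "false", "flags": 8}},
  {"type": "instance",
   "entry": {"name": "abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse",
             "instance": "RQZUR76UdeymR-PGaw6sbCHMCOcaovu",
             "exists": "true", "flags": 3}},
  {"type": "olh",
   "entry": {"key": {"name": "abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse",
                     "instance": "RQZUR76UdeymR-PGaw6sbCHMCOcaovu"},
             "exists": "true"}}
]
''')

rows = summarize(sample)
for row in rows:
    print(row)
```

The plain entry with an empty instance and exists set to false is the
one to look at; whether it corresponds to the "is not valid" message
from the osd log is a guess on my side, not something this confirms.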

Thanks,
Yehuda

On Tue, Sep 1, 2015 at 11:39 AM, Sam Wouters <sam@xxxxxxxxx> wrote:
> Sorry, forgot to mention:
>
> - yes, filtered by thread
> - the "is not valid" line occurred when performing the bucket --check
> - when doing a bucket listing, I also get an "is not valid", but on a
> different object:
> 7fe4f1d5b700 20 <cls> cls/rgw/cls_rgw.cc:460: entry
> abc_econtract/data/6scbrrlo4vttk72melewizj6n3[] is not valid
>
> bilog entry for this object similar to the one below
>
> r, Sam
>
> On 01-09-15 20:30, Sam Wouters wrote:
>> Hi,
>>
>> see inline
>>
>> On 01-09-15 20:14, Yehuda Sadeh-Weinraub wrote:
>>> I assume you filtered the log by thread? I don't see the response
>>> messages. For the bucket check you can run radosgw-admin with
>>> --log-to-stderr.
>> nothing is logged to the console when I do that
>>> Can you also set 'debug objclass = 20' on the osds? You can do it by:
>>>
>>> $ ceph tell osd.\* injectargs --debug-objclass 20
>> this continuously prints "20 <cls> cls/rgw/cls_rgw.cc:460: entry
>> abc_econtract/data/6smuz2ysavvxbygng34tgusyse[] is not valid" on osd.0
>>> Also, it'd be interesting to get the following:
>>>
>>> $ radosgw-admin bi list --bucket=<bucket name> \
>>>     --object=abc_econtract/data/6shflrwbwwcm6dsemrpjit2li3v913iad1EZQ3.S6Prb-NXLvfQRlaWC5nBYp5
>> this gives me an empty array:
>> [
>> ]
>> but we trimmed the bilog a while ago, because a lot of entries for
>> objects that were already removed from the bucket kept on syncing with
>> the sync agent, causing a lot of delete_markers at the replication site.
>>
>> The object in the error above from the osd log gives the following:
>> # radosgw-admin --log-to-stderr -n client.radosgw.be-east-1 bi list \
>>     --bucket=aws-cmis-prod \
>>     --object=abc_econtract/data/6smuz2ysavvxbygng34tgusyse
>> [
>>     {
>>         "type": "plain",
>>         "idx": "abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse",
>>         "entry": {
>>             "name": "abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse",
>>             "instance": "",
>>             "ver": {
>>                 "pool": -1,
>>                 "epoch": 0
>>             },
>>             "locator": "",
>>             "exists": "false",
>>             "meta": {
>>                 "category": 0,
>>                 "size": 0,
>>                 "mtime": "0.000000",
>>                 "etag": "",
>>                 "owner": "",
>>                 "owner_display_name": "",
>>                 "content_type": "",
>>                 "accounted_size": 0
>>             },
>>             "tag": "",
>>             "flags": 8,
>>             "pending_map": [],
>>             "versioned_epoch": 0
>>         }
>>     },
>>     {
>>         "type": "plain",
>>         "idx": "abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse\u0000v913\u0000iRQZUR76UdeymR-PGaw6sbCHMCOcaovu",
>>         "entry": {
>>             "name": "abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse",
>>             "instance": "RQZUR76UdeymR-PGaw6sbCHMCOcaovu",
>>             "ver": {
>>                 "pool": 23,
>>                 "epoch": 9680
>>             },
>>             "locator": "",
>>             "exists": "true",
>>             "meta": {
>>                 "category": 1,
>>                 "size": 103410,
>>                 "mtime": "2015-08-07 17:57:32.000000Z",
>>                 "etag": "6c67f5e6cb4aa63f4fa26a3b94d19d3a",
>>                 "owner": "aws-cmis-prod",
>>                 "owner_display_name": "AWS-CMIS prod user",
>>                 "content_type": "application\/pdf",
>>                 "accounted_size": 103410
>>             },
>>             "tag": "be-east.34319.4520377",
>>             "flags": 3,
>>             "pending_map": [],
>>             "versioned_epoch": 2
>>         }
>>     },
>>     {
>>         "type": "instance",
>>         "idx": "�1000_abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse\u0000iRQZUR76UdeymR-PGaw6sbCHMCOcaovu",
>>         "entry": {
>>             "name": "abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse",
>>             "instance": "RQZUR76UdeymR-PGaw6sbCHMCOcaovu",
>>             "ver": {
>>                 "pool": 23,
>>                 "epoch": 9680
>>             },
>>             "locator": "",
>>             "exists": "true",
>>             "meta": {
>>                 "category": 1,
>>                 "size": 103410,
>>                 "mtime": "2015-08-07 17:57:32.000000Z",
>>                 "etag": "6c67f5e6cb4aa63f4fa26a3b94d19d3a",
>>                 "owner": "aws-cmis-prod",
>>                 "owner_display_name": "AWS-CMIS prod user",
>>                 "content_type": "application\/pdf",
>>                 "accounted_size": 103410
>>             },
>>             "tag": "be-east.34319.4520377",
>>             "flags": 3,
>>             "pending_map": [],
>>             "versioned_epoch": 2
>>         }
>>     },
>>     {
>>         "type": "olh",
>>         "idx": "�1001_abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse",
>>         "entry": {
>>             "key": {
>>                 "name": "abc_econtract\/data\/6smuz2ysavvxbygng34tgusyse",
>>                 "instance": "RQZUR76UdeymR-PGaw6sbCHMCOcaovu"
>>             },
>>             "delete_marker": "false",
>>             "epoch": 2,
>>             "pending_log": [],
>>             "tag": "3ejreihlq1045d212goxvdlry31nbdde",
>>             "exists": "true",
>>             "pending_removal": "false"
>>         }
>>     }
>>
>> ]
>>>
>>> Thanks,
>>> Yehuda
>> I much appreciate the help...
>> Sam
>>> On Tue, Sep 1, 2015 at 10:44 AM, Sam Wouters <sam@xxxxxxxxx> wrote:
>>>> not sure where I can find the logs for the bucket check, I can't really
>>>> filter them out in the radosgw log.
>>>>
>>>> -Sam
>>>>
>>>> On 01-09-15 19:25, Sam Wouters wrote:
>>>>> It looks like it, this is what shows in the logs after bumping the debug
>>>>> and requesting a bucket list.
>>>>>
>>>>> 2015-09-01 17:14:53.008620 7fccb17ca700 10 cls_bucket_list
>>>>> aws-cmis-prod(@{i=.be-east.rgw.buckets.index}.be-east.rgw.buckets[be-east.5436.1])
>>>>> start
>>>>> abc_econtract/data/6shflrwbwwcm6dsemrpjit2li3v913iad1EZQ3.S6Prb-NXLvfQRlaWC5nBYp5[]
>>>>> num_entries 1
>>>>> 2015-09-01 17:14:53.008629 7fccb17ca700 20 reading from
>>>>> .be-east.rgw:.bucket.meta.aws-cmis-prod:be-east.5436.1
>>>>> 2015-09-01 17:14:53.008636 7fccb17ca700 20 get_obj_state:
>>>>> rctx=0x7fccb17c84d0
>>>>> obj=.be-east.rgw:.bucket.meta.aws-cmis-prod:be-east.5436.1
>>>>> state=0x7fcde01a4060 s->prefetch_data=0
>>>>> 2015-09-01 17:14:53.008640 7fccb17ca700 10 cache get:
>>>>> name=.be-east.rgw+.bucket.meta.aws-cmis-prod:be-east.5436.1 : hit
>>>>> 2015-09-01 17:14:53.008645 7fccb17ca700 20 get_obj_state: s->obj_tag was
>>>>> set empty
>>>>> 2015-09-01 17:14:53.008647 7fccb17ca700 10 cache get:
>>>>> name=.be-east.rgw+.bucket.meta.aws-cmis-prod:be-east.5436.1 : hit
>>>>> 2015-09-01 17:14:53.008675 7fccb17ca700  1 -- 10.11.4.105:0/1109243 -->
>>>>> 10.11.4.105:6801/39085 -- osd_op(client.55506.0:435874
>>>>> ...
>>>>> .dir.be-east.5436.1 [call rgw.bucket_list] 26.7d78fc84
>>>>> ack+read+known_if_redirected e255) v5 -- ?+0 0x7fcde01a0540 con 0x3a2d870
>>>>>
>>>>> On 01-09-15 17:11, Yehuda Sadeh-Weinraub wrote:
>>>>>> Can you bump up debug (debug rgw = 20, debug ms = 1), and see if the
>>>>>> operations (bucket listing and bucket check) go into some kind of
>>>>>> infinite loop?
>>>>>>
>>>>>> Yehuda
>>>>>>
>>>>>> On Tue, Sep 1, 2015 at 1:16 AM, Sam Wouters <sam@xxxxxxxxx> wrote:
>>>>>>> Hi, I started the bucket --check --fix on Friday evening and it's
>>>>>>> still running. 'ceph -s' shows the cluster health as OK; I don't know if
>>>>>>> there is anything else I could check. Is there a way of finding out if
>>>>>>> it's actually doing something?
>>>>>>>
>>>>>>> We only have this issue on the one bucket with versioning enabled, and I
>>>>>>> can't get rid of the feeling it has something to do with that. The
>>>>>>> "underscore bug" is also still present on that bucket
>>>>>>> (http://tracker.ceph.com/issues/12819). Not sure if that's related in any
>>>>>>> way.
>>>>>>> Are there any alternatives, for example copying all the objects into a
>>>>>>> new bucket without versioning? The simple way would be to list the objects
>>>>>>> and copy them to a new bucket, but bucket listing is not working, so...
>>>>>>>
>>>>>>> -Sam
>>>>>>>
>>>>>>>
>>>>>>> On 31-08-15 10:47, Gregory Farnum wrote:
>>>>>>>> This generally shouldn't be a problem at your bucket sizes. Have you
>>>>>>>> checked that the cluster is actually in a healthy state? The sleeping
>>>>>>>> locks are normal but should be getting woken up; if they aren't it
>>>>>>>> means the object access isn't working for some reason. A down PG or
>>>>>>>> something would be the simplest explanation.
>>>>>>>> -Greg
>>>>>>>>
>>>>>>>> On Fri, Aug 28, 2015 at 6:52 PM, Sam Wouters <sam@xxxxxxxxx> wrote:
>>>>>>>>> Ok, maybe I'm too impatient. It would be great if the radosgw-admin
>>>>>>>>> tool had some verbose or progress logging.
>>>>>>>>> I will start a check and let it run over the weekend.
>>>>>>>>>
>>>>>>>>> tnx,
>>>>>>>>> Sam
>>>>>>>>>
>>>>>>>>> On 28-08-15 18:16, Sam Wouters wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> this bucket only has 13389 objects, so the index size shouldn't be a
>>>>>>>>>> problem. Also, on the same cluster we have another bucket with 1200543
>>>>>>>>>> objects (but no versioning configured), which has no issues.
>>>>>>>>>>
>>>>>>>>>> when we run a radosgw-admin bucket --check (--fix), nothing seems to be
>>>>>>>>>> happening. Running strace on the process shows a lot of lines like these:
>>>>>>>>>> [pid 99372] futex(0x2d730d4, FUTEX_WAIT_PRIVATE, 156619, NULL
>>>>>>>>>> <unfinished ...>
>>>>>>>>>> [pid 99385] futex(0x2da9410, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
>>>>>>>>>> [pid 99371] futex(0x2da9410, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
>>>>>>>>>> [pid 99385] <... futex resumed> )       = -1 EAGAIN (Resource
>>>>>>>>>> temporarily unavailable)
>>>>>>>>>> [pid 99371] <... futex resumed> )       = 0
>>>>>>>>>>
>>>>>>>>>> but no errors in the ceph logs or health warnings.
>>>>>>>>>>
>>>>>>>>>> r,
>>>>>>>>>> Sam
>>>>>>>>>>
>>>>>>>>>> On 28-08-15 17:49, Ben Hines wrote:
>>>>>>>>>>> How many objects in the bucket?
>>>>>>>>>>>
>>>>>>>>>>> RGW has problems with index size once the number of objects gets into
>>>>>>>>>>> the 900000+ range. The buckets need to be recreated with 'sharded bucket
>>>>>>>>>>> indexes' on:
>>>>>>>>>>>
>>>>>>>>>>> rgw override bucket index max shards = 23
>>>>>>>>>>>
>>>>>>>>>>> You could also try repairing the index with:
>>>>>>>>>>>
>>>>>>>>>>>  radosgw-admin bucket check --fix --bucket=<bucketname>
>>>>>>>>>>>
>>>>>>>>>>> -Ben
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 28, 2015 at 8:38 AM, Sam Wouters <sam@xxxxxxxxx> wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> we have an rgw bucket (with versioning) where PUT and GET operations for
>>>>>>>>>>>> specific objects succeed, but retrieving an object list fails.
>>>>>>>>>>>> Using python-boto, the request just gives us a 500 internal error after
>>>>>>>>>>>> a timeout; radosgw-admin just hangs.
>>>>>>>>>>>> A radosgw-admin bucket check also just seems to hang...
>>>>>>>>>>>>
>>>>>>>>>>>> ceph version is 0.94.3, but this was also happening with 0.94.2; we
>>>>>>>>>>>> quietly hoped upgrading would fix it, but it didn't...
>>>>>>>>>>>>
>>>>>>>>>>>> r,
>>>>>>>>>>>> Sam
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> ceph-users mailing list
>>>>>>>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com