Re: corrupted rbd filesystems since jewel

I can set the following debug levels for osd 46:

ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_osd 20
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_filestore 20
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_ms 1
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_optracker 20

then issue the delete request and send you the log?
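
For reference, a rough sketch of how I would apply these in one go (assuming
$1 holds the OSD id, here 46, and the default admin socket path):

# raise the three 20-level subsystems in a loop, then debug_ms separately;
# the same loop can be reused later to lower the levels again
for opt in debug_osd debug_filestore debug_optracker; do
    ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set $opt 20
done
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_ms 1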

Greets,
Stefan

Am 17.05.2017 um 16:07 schrieb Jason Dillaman:
> The internals of the OSD are not exactly my area of expertise. Since
> you have the op tracker disabled and I'm assuming your cluster health
> is OK, perhaps you could run "gcore <osd 46 pid>" to preserve its
> current state at a bare minimum. Then, assuming you can restart vm191
> and re-issue the fstrim that blocks, perhaps you could increase the
> debug log level of OSD 46 and see if you can capture the remove
> request for rbd_data.e10ca56b8b4567.000000000000311c.
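> 
> To illustrate, a rough sketch of what that could look like (the pid lookup
> pattern and the default log path are assumptions, adjust for your setup):
> 
> # preserve the current state of the osd.46 process (hypothetical pid lookup)
> gcore -o /var/tmp/osd.46 $(pgrep -f 'ceph-osd.*46')
> # after raising the debug levels and re-running the fstrim, look for the op
> grep 'rbd_data.e10ca56b8b4567.000000000000311c' /var/log/ceph/ceph-osd.46.log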
> 
> On Wed, May 17, 2017 at 9:44 AM, Stefan Priebe - Profihost AG
> <s.priebe@xxxxxxxxxxxx> wrote:
>> Hello Jason,
>>
>> any idea how to debug this further? dmesg does not show any disk
>> failures. Smart values are also OK. There's also no xfs BUG or WARNING
>> from the kernel side.
>>
>> I'm sure that it will work after restarting osd.46 - but I'm losing the
>> ability to reproduce this in that case.
>>
>> Should I install debug symbols and run gdb's "thread apply all bt full"?
>>
>> Would that help?
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 15:12 schrieb Jason Dillaman:
>>> Perfect librbd log capture. I can see that a remove request to object
>>> rbd_data.e10ca56b8b4567.000000000000311c was issued but it never
>>> completed. This results in a hung discard and flush request.
>>>
>>> Assuming that object is still mapped to OSD 46, I think there is
>>> either something happening with that OSD or its backing disk that is
>>> preventing the remove for that object from completing.
>>>
>>> On Wed, May 17, 2017 at 6:04 AM, Stefan Priebe - Profihost AG
>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>> Hello Jason,
>>>>
>>>> attached is an rbd debug 20 log from the situation when the deadlock
>>>> occurs. I hope this helps.
>>>>
>>>> The VM was running until 2017-05-17 12:10 but there was no I/O for 10 min.
>>>>
>>>> Greets,
>>>> Stefan
>>>>
>>>> Am 16.05.2017 um 22:54 schrieb Jason Dillaman:
>>>>> It looks like it's just a ping message in that capture.
>>>>>
>>>>> Are you saying that you restarted OSD 46 and the problem persisted?
>>>>>
>>>>> On Tue, May 16, 2017 at 4:02 PM, Stefan Priebe - Profihost AG
>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> while reproducing the problem, objecter_requests looks like this:
>>>>>>
>>>>>> {
>>>>>>     "ops": [
>>>>>>         {
>>>>>>             "tid": 42029,
>>>>>>             "pg": "5.bd9616ad",
>>>>>>             "osd": 46,
>>>>>>             "object_id": "rbd_data.e10ca56b8b4567.000000000000311c",
>>>>>>             "object_locator": "@5",
>>>>>>             "target_object_id": "rbd_data.e10ca56b8b4567.000000000000311c",
>>>>>>             "target_object_locator": "@5",
>>>>>>             "paused": 0,
>>>>>>             "used_replica": 0,
>>>>>>             "precalc_pgid": 0,
>>>>>>             "last_sent": "2.28854e+06s",
>>>>>>             "attempts": 1,
>>>>>>             "snapid": "head",
>>>>>>             "snap_context": "a07c2=[]",
>>>>>>             "mtime": "2017-05-16 21:53:22.0.069541s",
>>>>>>             "osd_ops": [
>>>>>>                 "delete"
>>>>>>             ]
>>>>>>         }
>>>>>>     ],
>>>>>>     "linger_ops": [
>>>>>>         {
>>>>>>             "linger_id": 1,
>>>>>>             "pg": "5.5f3bd635",
>>>>>>             "osd": 17,
>>>>>>             "object_id": "rbd_header.e10ca56b8b4567",
>>>>>>             "object_locator": "@5",
>>>>>>             "target_object_id": "rbd_header.e10ca56b8b4567",
>>>>>>             "target_object_locator": "@5",
>>>>>>             "paused": 0,
>>>>>>             "used_replica": 0,
>>>>>>             "precalc_pgid": 0,
>>>>>>             "snapid": "head",
>>>>>>             "registered": "1"
>>>>>>         }
>>>>>>     ],
>>>>>>     "pool_ops": [],
>>>>>>     "pool_stat_ops": [],
>>>>>>     "statfs_ops": [],
>>>>>>     "command_ops": []
>>>>>> }
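>>>>>>
>>>>>> (For reference, a dump like this can be pulled from the client side
>>>>>> admin socket, assuming one is configured for the qemu process, along
>>>>>> the lines of:
>>>>>> ceph --admin-daemon /var/run/ceph/<client>.asok objecter_requests
>>>>>> where <client>.asok is a placeholder for the actual socket path.)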
>>>>>>
>>>>>> Yes, they have an established TCP connection, Qemu <=> osd.46. Attached is
>>>>>> a pcap file of the traffic between them when it got stuck.
>>>>>>
>>>>>> Greets,
>>>>>> Stefan
>>>>>>
>>>>>> Am 16.05.2017 um 21:45 schrieb Jason Dillaman:
>>>>>>> On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>> We've enabled the op tracker for performance reasons while using SSD
>>>>>>>> only storage ;-(
>>>>>>>
>>>>>>> Disabled you mean?
>>>>>>>
>>>>>>>> Can I enable the op tracker using ceph osd tell? Then reproduce the
>>>>>>>> problem and check what got stuck again? Or should I generate an rbd log
>>>>>>>> from the client?
>>>>>>>
>>>>>>> From a super-quick glance at the code, it looks like that isn't a
>>>>>>> dynamic setting. Of course, it's possible that if you restart OSD 46
>>>>>>> to enable the op tracker, the stuck op will clear itself and the VM
>>>>>>> will resume. You could attempt to generate a gcore of OSD 46 to see if
>>>>>>> information on that op could be extracted via the debugger, but no
>>>>>>> guarantees.
>>>>>>>
>>>>>>> You might want to verify that the stuck client and OSD 46 have an
>>>>>>> actual established TCP connection as well before doing any further
>>>>>>> actions.
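>>>>>>>
>>>>>>> As a rough idea (run on the hypervisor hosting the VM; the address
>>>>>>> placeholder is something you'd fill in yourself):
>>>>>>>
>>>>>>> ceph osd find 46               # note the ip:port reported for osd.46
>>>>>>> ss -tnp | grep <ip of osd.46>  # expect an ESTABLISHED socket owned by the qemu process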
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>>>
> 
> 
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


