I just wanted to follow up to say that after applying these patches and
running it for a few weeks we haven't seen another lockup under load.

- Milosz

On Mon, Jul 8, 2013 at 5:16 PM, Milosz Tanski <milosz@xxxxxxxxx> wrote:
> In this case (unlike last week) the restart did unlock my clients.
>
> - M
>
> On Mon, Jul 8, 2013 at 4:30 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>> On Tue, Jul 9, 2013 at 3:58 AM, Milosz Tanski <milosz@xxxxxxxxx> wrote:
>>> Yan,
>>>
>>> Actually, after playing some more today I have another one of my
>>> clients stuck in this spot. When I look at the kernel stacks, this is
>>> what I see for all the threads:
>>>
>>> [<ffffffffa02d2bab>] ceph_mdsc_do_request+0xcb/0x1a0 [ceph]
>>> [<ffffffffa02c018f>] ceph_do_getattr+0xdf/0x120 [ceph]
>>> [<ffffffffa02c01f4>] ceph_getattr+0x24/0x100 [ceph]
>>> [<ffffffff811775fd>] vfs_getattr+0x4d/0x80
>>> [<ffffffff8117784d>] vfs_fstat+0x3d/0x70
>>> [<ffffffff81177895>] SYSC_newfstat+0x15/0x30
>>> [<ffffffff8117794e>] SyS_newfstat+0xe/0x10
>>> [<ffffffff8155dd59>] system_call_fastpath+0x16/0x1b
>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>
>>> Anything I can do on my end to debug this issue?
>>>
>>
>> Find the hung request (and its inode) through /sys/kernel/debug/ceph/xxx/mdsc,
>> use 'ceph mds tell \* dumpcache' to dump the mds cache, then open
>> /cachedump.xxx and check the inode's state.
>>
>> Does your kernel include all the fixes in the testing branch of ceph-client?
>> Does restarting the mds resolve the hang?
>>
>> yan, zheng
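
For anyone hitting the same hang, the steps Yan outlines above translate
roughly into the shell session below. This is only a sketch: it assumes
debugfs is mounted at /sys/kernel/debug, uses a glob in place of the
client-specific "xxx" directory, and the inode number is a placeholder,
not a value taken from this report.

    # On the hung client: list in-flight MDS requests.  Each entry shows the
    # request tid, the inode it targets, and the operation (getattr in the
    # trace above); note the inode number of the stuck request.
    cat /sys/kernel/debug/ceph/*/mdsc

    # Ask every MDS to dump its in-memory cache to a file on the MDS host
    # (the /cachedump.xxx file referred to above).
    ceph mds tell \* dumpcache

    # On the MDS host: find that inode in the dump and inspect its state
    # (issued caps, locks, waiters).  Replace the placeholder with the real
    # inode number from the mdsc output.
    grep -A 5 '<inode-number>' /cachedump.xxx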