Yan,

Actually, after playing some more today I have another one of my clients stuck
in this spot. When I look at the kernel stacks, this is what I see for all the
threads:

[<ffffffffa02d2bab>] ceph_mdsc_do_request+0xcb/0x1a0 [ceph]
[<ffffffffa02c018f>] ceph_do_getattr+0xdf/0x120 [ceph]
[<ffffffffa02c01f4>] ceph_getattr+0x24/0x100 [ceph]
[<ffffffff811775fd>] vfs_getattr+0x4d/0x80
[<ffffffff8117784d>] vfs_fstat+0x3d/0x70
[<ffffffff81177895>] SYSC_newfstat+0x15/0x30
[<ffffffff8117794e>] SyS_newfstat+0xe/0x10
[<ffffffff8155dd59>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

Anything I can do on my end to debug this issue? (A rough collection sketch is
appended after the quoted thread below.)

- Milosz

P.S.: Sorry for the second email if you got it; Gmail keeps switching me to
non-plain-text mode. Sigh.

On Mon, Jul 8, 2013 at 10:42 AM, Milosz Tanski <milosz@xxxxxxxxx> wrote:
> Yan,
>
> So it looks like it fixes the issue. I had to update all my clients
> and restart the MDS, and things got back to normal.
>
> - Milosz
>
> On Wed, Jul 3, 2013 at 6:43 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>> On Thu, Jul 4, 2013 at 6:07 AM, Milosz Tanski <milosz@xxxxxxxxx> wrote:
>>> Yan,
>>>
>>> Can you help me understand how this change fixes
>>> http://tracker.ceph.com/issues/2019 ? The symptom on the client is
>>> that processes get stuck waiting in ceph_mdsc_do_request, according
>>> to /proc/PID/stack.
>>
>> The bug this patch fixes is that ceph_sync_write_unsafe can be called
>> multiple times with the parameter unsafe=true. This prevents the
>> kclient from releasing the Fw cap, which in turn leaves the filelock
>> stuck in an unstable state forever, so requests hang.
>>
>> Yan, Zheng
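[Editor's sketch] For the "anything I can do on my end to debug this" question,
the usual starting point is to capture the per-thread kernel stacks of the stuck
tasks together with the kernel client's view of its in-flight MDS requests. The
rough Python sketch below assumes debugfs is mounted at /sys/kernel/debug, that
the ceph module exposes a per-client "mdsc" file there (as kernels of this
vintage do), and that it is run as root; paths and behaviour should be verified
on the affected machine.

#!/usr/bin/env python
# Rough sketch: gather state useful for a hung-MDS-request report --
# per-thread kernel stacks of tasks stuck in ceph_mdsc_do_request, plus the
# kernel client's list of in-flight MDS requests from debugfs.
import glob

def read(path):
    try:
        with open(path) as f:
            return f.read()
    except (IOError, OSError):
        return ""

# 1) Tasks whose kernel stack mentions ceph_mdsc_do_request.
for stack_path in glob.glob("/proc/[0-9]*/task/[0-9]*/stack"):
    stack = read(stack_path)
    if "ceph_mdsc_do_request" in stack:
        tid = stack_path.split("/")[4]
        comm = read("/proc/%s/comm" % tid).strip()
        print("=== tid %s (%s) ===" % (tid, comm))
        print(stack)

# 2) In-flight MDS requests as seen by the kernel client (requires debugfs).
for mdsc_path in glob.glob("/sys/kernel/debug/ceph/*/mdsc"):
    print("=== %s ===" % mdsc_path)
    print(read(mdsc_path))

The mdsc output lists the MDS requests the client still considers outstanding,
which is what you would correlate with the MDS side when reporting the hang.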
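[Editor's sketch] To make the failure mode Yan describes concrete, here is a
deliberately simplified userspace model of the described pattern -- it is not
the kernel code, and the names (CapTracker, on_write_completion) are
illustrative only. Each "unsafe" write completion pins the Fw (file write)
capability with a reference, and the final "safe" completion drops exactly one;
if the unsafe completion runs twice for the same request, one reference is
never dropped, Fw can never be released, and later requests hang behind it.

# Simplified model of the cap-reference imbalance described in the thread.
class CapTracker(object):
    def __init__(self):
        self.fw_refs = 0

    def on_write_completion(self, unsafe):
        if unsafe:
            self.fw_refs += 1   # pin Fw until the write is durable
        else:
            self.fw_refs -= 1   # durable: drop the pin

    def can_release_fw(self):
        return self.fw_refs == 0

caps = CapTracker()
caps.on_write_completion(unsafe=True)    # first (unsafe) ack
caps.on_write_completion(unsafe=True)    # duplicate unsafe ack -- the bug
caps.on_write_completion(unsafe=False)   # final (safe) ack
print("fw_refs=%d can_release=%s" % (caps.fw_refs, caps.can_release_fw()))
# fw_refs=1 can_release=False -> Fw is never released, requests hang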