Re: mds getattr locked again

Leander Yu <leander.yu@xxxxxxxxx> · Sat, 30 Oct 2010 10:17:09 +0800

On Oct 30, 2010, at 12:26 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:

> On Fri, 29 Oct 2010, Leander Yu wrote:
>> Hi Sage,
>>
>> On Fri, Oct 29, 2010 at 1:08 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>> On Fri, 29 Oct 2010, Henry C Chang wrote:
>>>> Hi,
>>>>
>>>> getattr on mds hanged again.
>>>>
>>>> I have already reverted d91f2438d881514e4a923fd786dbd94b764a9440.
>>>> Although the probability is significant lowered down, it still has the
>>>> chance to hang on getattr.
>>>>
>>>> Attached are the logs of mds and the hanging client. :(
>>>>
>>>> I'm using ceph-client-standalone master-backport branch on 2.6.32 kernel.
>>>
>>> It looks like ceph_check_caps is hung somehow:
>>>
>>> ceph:ceph:  handle_caps from mds0
>>> ceph:ceph:   mds0 seq 99 cap seq 28
>>> ceph:ceph:   op revoke ino 10000000bd1.fffffffffffffffe inode
>>> ffff8800a6251d88
>>> ceph:ceph:  handle_cap_grant inode ffff8800a6251d88 cap ffff8800a635b780
>>> mds0 seq 28 pAsLsXsFr
>>> ceph:ceph:   size 4294967296 max_size 8594128896, i_size 4294967296
>>> ceph:ceph:  try_nonblocking_invalidate ffff8800a6251d88 success
>>> ceph:ceph:  __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780
>>> issued pAsLsXsFscr
>>> ceph:ceph:  __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780
>>> issued pAsLsXsFscr
>>> ceph:ceph:  ffff8800a6251d88 mode 0100644 uid.gid 0.0
>>> ceph:ceph:   my wanted = pAsxXsxFsxcrwb, used = pFcr, dirty -
>>> ceph:ceph:  revocation: pAsLsXsFscr -> pAsLsXsFr (revoking Fsc)
>>> ceph:ceph:  __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780
>>> issued pAsLsXsFr
>>> ceph:ceph:  check_caps ffff8800a6251d88 file_want pAsxXsxFsxcrwb used pFcr
>>> dirty - flushing - issued pAsLsXsFr revoking Fsc retain pAsxLsxXsxFsxcrwbl
>>> AUTHONLY NODELAY
>>> ceph:ceph:   mds0 revoking Fsc
>>> ceph:ceph:  mdsc put_session ffff8800b41c6000 3 -> 2
>>> ceph:ceph:  mdsc con_put ffff8800b41c6000 (2)
>>> ceph:ceph:  aio_read ffff8800a6251d88 10000000bd1.fffffffffffffffe
>>> dropping cap refs on Fcr = 512
>>> ceph:ceph:  put_cap_refs ffff8800a6251d88 had Fcr last
>>> ceph:ceph:  __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780
>>> issued pAsLsXsFr
>>> ceph:ceph:  check_caps ffff8800a6251d88 file_want pAsxXsxFsxcrwb used pFc
>>> dirty - flushing - issued pAsLsXsFr revoking Fsc retain pAsxLsxXsxFsxcrwbl
>>> ceph:ceph:  check_caps trying to invalidate on ffff8800a6251d88
>>> ceph:ceph:  try_nonblocking_invalidate ffff8800a6251d88 failed
>>> ceph:ceph:  check_caps queuing invalidate
>>>
>>> --> this means queue_invalidate = 1, and check_caps will call
>>> ceph_queue_invalidate on exit, which will always print something...
>>>
>>> ceph:ceph:  __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780 issued pAsLsXsFr
>>> ceph:ceph:  check_caps ffff8800a6251d88 file_want pAsxXsxFsxcrwb used pFc
>>> dirty - flushing - issued pAsLsXsFr revoking Fsc retain pAsxLsxXsxFsxcrwbl
>>> ceph:ceph:   mds0 revoking Fsc
>>> ceph:ceph:  __cap_delay_cancel ffff8800a6251d88
>>>
>>> ...but that never happens.  Probably the CPU got blocked somewhere?  Can
>>> you see what the system is doing at this point?  sysrq-t, or check the
>>> process list for ceph-msgr and cat it's stack (/proc/$pid/stack)?  The
>>> task should be blocked in ceph_check_caps() somewhere...
>>
>> it didn't go to ceph_queue_invalidate because delayed and is_delayed = 0.
>>
>>         if (delayed && is_delayed)
>>                 force_requeue = 1;   /* __send_cap delayed release; requeue */
>>         if (!delayed && !is_delayed)
>>                 __cap_delay_cancel(mdsc, ci);
>>         else if (!is_delayed || force_requeue)
>>                 __cap_delay_requeue(mdsc, ci);
>>
>>         spin_unlock(&inode->i_lock);
>>
>>         if (queue_invalidate)
>>                 ceph_queue_invalidate(inode);
>>
>> so it goes to __cap_delay_cancel().
>>
>> Should I said, the correct behavior is to go to ceph_queue_invalidate(inode)?
>
> I don't think delayed && is_delayed are related.  (Those deal with whether
> we come back and re-check caps a little latter.)  The invalidate should be
> queued immediately: at this point queue_invalidate == 1, and the function
> only has one exit point, so one way or another ceph_queue_invalidate()
> should get called.  Since it's not, we must be getting stuck somewhere...
>
> Some combination of sysrq and strategic printk's should narrow down where
> things go wrong.
>
> Thanks!
> sage

>From the kernel debug log we found it went to __cap_delay_cancel()
then didn't go to ceph_queue_invalidate(inode).  add one more checking
condition as following workaround it

 if (!delayed && !is_delayed && !queue_invalidate)
                 __cap_delay_cancel(mdsc, ci);

But you are right, no matter what, it should goes to
ceph_queue_invalidate() after return from __cap_delay_cancel().
We will add some more printk to see where it got stuck. The most
suspicious part is spin_lock(&mdsc->cap_delay_lock) inside
__cap_delay_cancel()

>
>
>>
>>
>>>
>>> (BTW, if you're building your own kernel, one thing that I've found
>>> helpful is enabling the CONFIG_PRINTK_TIME option in .config, and updating
>>> kernel/printk.c to also include current->pid in the line prefix.  That
>>> helps sort out what tasks are doing what when.  But if you're stuck on
>>> 2.6.32 for some reason that probably not the case!)
>>>
>>> Thanks!
>>> sage
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html