Re: which kernel version can help avoid kernel client deadlock

Hi, Ilya,
  
  In the dmesg there are also a lot of libceph socket errors, which I think
  may be caused by my stopping the ceph service without unmapping the rbds.

  Here is a log of more than 10000 lines with more info: http://jmp.sh/NcokrfT

  Thanks for being willing to help.
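
  For reference, a cleaner shutdown order would be to unmap the images before
  stopping the daemons; a rough sketch (the device and daemon names below are
  examples, not my exact setup):

      # list kernel-mapped rbd devices
      rbd showmapped

      # unmap each device first
      rbd unmap /dev/rbd0

      # only then stop the ceph daemons (exact invocation depends on the init system)
      service ceph stop osd.0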


On Jul 28, 2015, at 7:11 PM, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:

On Tue, Jul 28, 2015 at 11:19 AM, van <chaofanyu@xxxxxxxxxxx> wrote:
Hi, Ilya,

 Thanks for your quick reply.

 Here is the link http://ceph.com/docs/cuttlefish/faq/ , under the "HOW
CAN I GIVE CEPH A TRY?" section, which talks about the old kernel issue.

 By the way, what is the main reason for using kernel 4.1? Are there a lot
of critical bug fixes in that version besides the perf improvements?
 I am worried that kernel 4.1 is too new and may introduce other problems.

Well, I'm not sure what exactly is in 3.10.0.229, so I can't tell you
off hand.  I can think of one important memory pressure related fix
that's probably not in there.

I'm suggesting the latest stable version of 4.1 (currently 4.1.3),
because if you hit a deadlock (remember, this is a configuration that
is neither recommended nor guaranteed to work), it'll be easier to
debug and fix if the fix turns out to be worth it.

If 4.1 is not acceptable for you, try the latest stable version of 3.18
(that is 3.18.19).  It's an LTS kernel, so that should mitigate some of
your concerns.
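
Either way, it is worth double-checking what each node is actually running;
nothing ceph-specific here:

    # running kernel version
    uname -r

    # details of the rbd kernel module that ships with it
    modinfo rbd | head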

 And if I'm using the librbd API, does the kernel version matter?

No, not so much.
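
librbd lives entirely in userspace, so the rbd kernel module is never
involved. A quick way to compare the two paths (the image name "test" is
just an example):

    # userspace path: the rbd CLI drives I/O through librbd
    rbd bench-write test

    # kernel path: mapping goes through the rbd kernel module
    rbd map test        # creates a /dev/rbdX device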


 In my tests, I built a 2-node cluster, each node with only one OSD, running
CentOS 7.1 with kernel 3.10.0.229 and ceph v0.94.2.
 I created several rbds and ran mkfs.xfs on them to create filesystems
(the kernel clients were running on the ceph cluster nodes).
 I performed heavy IO tests on those filesystems and found that some fio
processes hung and stayed in D state forever (uninterruptible sleep).
 I suspect it is the deadlock that made the fio processes hang.
 However, the ceph-osd daemons are still responsive, and I can operate rbds
via the librbd API.
 Does this mean it is not the loopback mount deadlock that caused the fio
processes to hang?
 Or is it also a deadlock phenomenon in which only one thread is blocked in
memory allocation while other threads can still receive API requests, so the
ceph-osd daemons remain responsive?

 What is worth mentioning is that after I restarted the ceph-osd daemon, all
processes in D state came back to normal.
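
 Roughly what the tests did, for reference (the paths and fio parameters
here are illustrative, not the exact job files):

    rbd map testimg                       # image name is an example
    mkfs.xfs /dev/rbd0
    mount /dev/rbd0 /mnt/rbd0

    # heavy random-write load
    fio --name=stress --filename=/mnt/rbd0/testfile --rw=randwrite \
        --bs=4k --size=1G --numjobs=4 --direct=1

    # list tasks stuck in uninterruptible sleep (D state)
    ps -eo pid,stat,comm | awk '$2 ~ /^D/'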

 Below is the related kernel log:

Jul  7 02:25:39 node0 kernel: INFO: task xfsaild/rbd1:24795 blocked for more than 120 seconds.
Jul  7 02:25:39 node0 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul  7 02:25:39 node0 kernel: xfsaild/rbd1    D ffff880c2fc13680     0 24795 2 0x00000080
Jul  7 02:25:39 node0 kernel: ffff8801d6343d40 0000000000000046 ffff8801d6343fd8 0000000000013680
Jul  7 02:25:39 node0 kernel: ffff8801d6343fd8 0000000000013680 ffff880c0c0b0000 ffff880c0c0b0000
Jul  7 02:25:39 node0 kernel: ffff880c2fc14340 0000000000000001 0000000000000000 ffff8805bace2528
Jul  7 02:25:39 node0 kernel: Call Trace:
Jul  7 02:25:39 node0 kernel: [<ffffffff81609e39>] schedule+0x29/0x70
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a1890>] _xfs_log_force+0x230/0x290 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffff810a9620>] ? wake_up_state+0x20/0x20
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a1916>] xfs_log_force+0x26/0x80 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a6390>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a64e1>] xfsaild+0x151/0x5e0 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a6390>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffff8109739f>] kthread+0xcf/0xe0
Jul  7 02:25:39 node0 kernel: [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
Jul  7 02:25:39 node0 kernel: [<ffffffff8161497c>] ret_from_fork+0x7c/0xb0
Jul  7 02:25:39 node0 kernel: [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
Jul  7 02:25:39 node0 kernel: INFO: task xfsaild/rbd5:2914 blocked for more than 120 seconds.

Is that all there is in dmesg?  Can you paste the entire dmesg?
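
Something like this will capture it in full, with human-readable timestamps:

    dmesg -T > dmesg.txt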

Thanks,

               Ilya

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
