Re: which kernel version can help avoid kernel client deadlock

Hi, Ilya,

  Thanks for your quick reply.

  Here is the link: http://ceph.com/docs/cuttlefish/faq/ ; the "HOW CAN I GIVE CEPH A TRY?" section is the one that talks about the old kernel issue.

  By the way, what is the main reason for recommending kernel 4.1? Are there many critical bug fixes in that version, beyond the performance improvements?
  I am worried that kernel 4.1 is so new that it may introduce other problems.
  And if I'm using the librbd API, does the kernel version matter?
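  To make that question concrete, here is a minimal sketch (just my understanding, not from the FAQ) of how I access images through librbd via the python-rbd bindings; the pool name 'rbd' and image name 'test-img' are placeholders. Since the I/O on this path stays in userspace librbd and never touches the rbd kernel module, I assume the kernel version matters much less here:

    import rados
    import rbd

    # Connect with the default cluster config; pool 'rbd' and image
    # 'test-img' are hypothetical names used only for this illustration.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('rbd')                 # open the pool
        try:
            rbd.RBD().create(ioctx, 'test-img', 1 << 30)  # 1 GiB image
            image = rbd.Image(ioctx, 'test-img')
            try:
                image.write(b'hello', 0)   # I/O goes through userspace
                print(image.read(0, 5))    # librbd, not the krbd module
            finally:
                image.close()
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()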

  In my tests, I built a two-node cluster, each node with a single OSD, running CentOS 7.1 (kernel 3.10.0-229) and Ceph v0.94.2.
  I created several rbd images and ran mkfs.xfs on them to create filesystems (the kernel client was running on the ceph cluster nodes).
  I then ran heavy IO tests on those filesystems and found that some fio processes hung and stayed in D state (uninterruptible sleep) forever.
  I suspect it is this deadlock that makes the fio processes hang.
  However, the ceph-osd daemons are still responsive, and I can still operate on rbd images via the librbd API.
  Does this mean it is not the loopback mount deadlock that causes the fio processes to hang?
  Or is it still a deadlock phenomenon, where only one thread is blocked in memory allocation while the other threads can still service API requests, which is why ceph-osd remains responsive?
 
  It is worth mentioning that after I restart the ceph-osd daemons, all processes in D state return to normal.
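  For reference, this is roughly how I look for the stuck tasks: a minimal sketch that just scans /proc for tasks in D state (standard Linux /proc layout, nothing ceph-specific).

    import os

    def d_state_tasks():
        """Return (pid, comm) for tasks in uninterruptible sleep."""
        hung = []
        for pid in filter(str.isdigit, os.listdir('/proc')):
            try:
                with open('/proc/%s/stat' % pid) as f:
                    stat = f.read()
            except IOError:
                continue  # task exited while we were scanning
            # comm is wrapped in parentheses and may contain spaces, so
            # locate the state field relative to the closing parenthesis
            comm = stat[stat.index('(') + 1:stat.rindex(')')]
            state = stat[stat.rindex(')') + 2]
            if state == 'D':
                hung.append((int(pid), comm))
        return hung

    for pid, comm in d_state_tasks():
        print('%d %s' % (pid, comm))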

  Below is the related kernel log:

Jul  7 02:25:39 node0 kernel: INFO: task xfsaild/rbd1:24795 blocked for more than 120 seconds.
Jul  7 02:25:39 node0 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul  7 02:25:39 node0 kernel: xfsaild/rbd1    D ffff880c2fc13680     0 24795      2 0x00000080
Jul  7 02:25:39 node0 kernel: ffff8801d6343d40 0000000000000046 ffff8801d6343fd8 0000000000013680
Jul  7 02:25:39 node0 kernel: ffff8801d6343fd8 0000000000013680 ffff880c0c0b0000 ffff880c0c0b0000
Jul  7 02:25:39 node0 kernel: ffff880c2fc14340 0000000000000001 0000000000000000 ffff8805bace2528
Jul  7 02:25:39 node0 kernel: Call Trace:
Jul  7 02:25:39 node0 kernel: [<ffffffff81609e39>] schedule+0x29/0x70
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a1890>] _xfs_log_force+0x230/0x290 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffff810a9620>] ? wake_up_state+0x20/0x20
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a1916>] xfs_log_force+0x26/0x80 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a6390>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a64e1>] xfsaild+0x151/0x5e0 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffffa03a6390>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jul  7 02:25:39 node0 kernel: [<ffffffff8109739f>] kthread+0xcf/0xe0
Jul  7 02:25:39 node0 kernel: [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
Jul  7 02:25:39 node0 kernel: [<ffffffff8161497c>] ret_from_fork+0x7c/0xb0
Jul  7 02:25:39 node0 kernel: [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
Jul  7 02:25:39 node0 kernel: INFO: task xfsaild/rbd5:2914 blocked for more than 120 seconds.

  Has anyone encountered the same problem, or could anyone help with this?

  Thanks. 
  

On Jul 28, 2015, at 3:01 PM, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:

On Tue, Jul 28, 2015 at 9:17 AM, van <chaofanyu@xxxxxxxxxxx> wrote:
Hi, list,

 I found in the ceph FAQ that the ceph kernel client should not run on
machines that belong to the ceph cluster.
 As the ceph FAQ mentions, "In older kernels, Ceph can deadlock if you try to
mount CephFS or RBD client services on the same host that runs your test
Ceph cluster. This is not a Ceph-related issue."
 Here it says that there will be a deadlock when using an old kernel version.
 I wonder if anyone knows which new kernel version solves this loopback
mount deadlock.
 It would be a great help, since I do need to use the rbd kernel client on the
ceph cluster nodes.

Note that doing this is *not* recommended.  That said, if you don't
push your system to its knees too hard, it should work.  I'm not sure
what exactly constitutes an older kernel as per that FAQ (as you
haven't even linked it), but even if I knew, I'd still suggest 4.1.




 As I searched for more information, I found two articles,
https://lwn.net/Articles/595652/ and https://lwn.net/Articles/596618/, which
talk about supporting NFS loopback mounts. It seems that effort went not only
into memory management but also into NFS-related code. I wonder if ceph has
made a similar effort in the kernel client to solve this problem. If it has,
could anyone point out the kernel version that carries the patch?

There wasn't any specific effort on the ceph side, but we do try not to
break it: sometime around 3.18 a ceph patch was merged that made it
impossible to co-locate the kernel client with OSDs; once we realized
that, the culprit patch was reverted and the revert was backported.

So the bottom line is we don't recommend it, but we try not to break
your ability to do it ;)

Thanks,

               Ilya

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
