Re: Giant + nfs over cephfs hang tasks

On Fri, Nov 28, 2014 at 3:02 PM, Andrei Mikhailovsky <andrei@xxxxxxxxxx> wrote:
> I've just tried the latest ubuntu-vivid kernel and I am also seeing hung
> tasks with dd tests:
>
>
> [ 3721.026421] INFO: task nfsd:16596 blocked for more than 120 seconds.
> [ 3721.065141]       Not tainted 3.17.4-031704-generic #201411211317
> [ 3721.103721] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 3721.180409] nfsd            D 0000000000000009     0 16596      2 0x00000000
> [ 3721.180412]  ffff88006f0cbc18 0000000000000046 ffff88006f0cbbc8 ffffffff8109677f
> [ 3721.180414]  ffff88006f0cbfd8 00000000000145c0 ffff88067089f700 00000000000145c0
> [ 3721.180417]  ffff880673dac600 ffff88045adfbc00 ffff8801bdf8be40 ffff88000b841500
> [ 3721.180420] Call Trace:
> [ 3721.180423]  [<ffffffff8109677f>] ? set_groups+0x2f/0x60
> [ 3721.180427]  [<ffffffff817a20c9>] schedule+0x29/0x70
> [ 3721.180440]  [<ffffffff817a23ee>] schedule_preempt_disabled+0xe/0x10
> [ 3721.180443]  [<ffffffff817a429d>] __mutex_lock_slowpath+0xcd/0x1d0
> [ 3721.180447]  [<ffffffff817a43c3>] mutex_lock+0x23/0x37
> [ 3721.180454]  [<ffffffffc071cadd>] nfsd_setattr+0x15d/0x2a0 [nfsd]
> [ 3721.180460]  [<ffffffffc0727d2e>] nfsd4_setattr+0x14e/0x180 [nfsd]
> [ 3721.180467]  [<ffffffffc0729eac>] nfsd4_proc_compound+0x4cc/0x730 [nfsd]
> [ 3721.180478]  [<ffffffffc0715e55>] nfsd_dispatch+0xe5/0x230 [nfsd]
> [ 3721.180491]  [<ffffffffc05b9882>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc]
> [ 3721.180500]  [<ffffffffc05b8694>] svc_process_common+0x324/0x680 [sunrpc]
> [ 3721.180510]  [<ffffffffc05b8d43>] svc_process+0x103/0x160 [sunrpc]
> [ 3721.180516]  [<ffffffffc07159c7>] nfsd+0x117/0x190 [nfsd]
> [ 3721.180526]  [<ffffffffc07158b0>] ? nfsd_destroy+0x80/0x80 [nfsd]
> [ 3721.180528]  [<ffffffff81093359>] kthread+0xc9/0xe0
> [ 3721.180533]  [<ffffffff81093290>] ? flush_kthread_worker+0x90/0x90
> [ 3721.180536]  [<ffffffff817a64bc>] ret_from_fork+0x7c/0xb0
> [ 3721.180540]  [<ffffffff81093290>] ? flush_kthread_worker+0x90/0x90
> [ 3721.180577] INFO: task kworker/2:3:28061 blocked for more than 120 seconds.
> [ 3721.221450]       Not tainted 3.17.4-031704-generic #201411211317
> [ 3721.261440] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 3721.341394] kworker/2:3     D 0000000000000002     0 28061      2 0x00000000
> [ 3721.341408] Workqueue: ceph-trunc ceph_vmtruncate_work [ceph]
> [ 3721.341409]  ffff8805a6507b08 0000000000000046 ffff8805a6507b88 ffffea00040e8c80
> [ 3721.341412]  ffff8805a6507fd8 00000000000145c0 ffff88067089b480 00000000000145c0
> [ 3721.341414]  ffff8801fffec600 ffff880102535a00 ffff88000b8415a8 ffff88046fc94ec0
> [ 3721.341417] Call Trace:
> [ 3721.341421]  [<ffffffff817a2970>] ? bit_wait+0x50/0x50
> [ 3721.341424]  [<ffffffff817a20c9>] schedule+0x29/0x70
> [ 3721.341427]  [<ffffffff817a219f>] io_schedule+0x8f/0xd0
> [ 3721.341430]  [<ffffffff817a299b>] bit_wait_io+0x2b/0x50
> [ 3721.341433]  [<ffffffff817a2656>] __wait_on_bit_lock+0x76/0xb0
> [ 3721.341438]  [<ffffffff811756b5>] ? find_get_entries+0xe5/0x160
> [ 3721.341440]  [<ffffffff8117245e>] __lock_page+0xae/0xb0
> [ 3721.341446]  [<ffffffff810b3fd0>] ? wake_atomic_t_function+0x40/0x40
> [ 3721.341451]  [<ffffffff81183226>] truncate_inode_pages_range+0x446/0x700
> [ 3721.341455]  [<ffffffff81183565>] truncate_inode_pages+0x15/0x20
> [ 3721.341457]  [<ffffffff811835bc>] truncate_pagecache+0x4c/0x70
> [ 3721.341464]  [<ffffffffc09f815e>] __ceph_do_pending_vmtruncate+0xde/0x230 [ceph]
> [ 3721.341470]  [<ffffffffc09f8c73>] ceph_vmtruncate_work+0x23/0x50 [ceph]
> [ 3721.341476]  [<ffffffff8108cece>] process_one_work+0x14e/0x460
> [ 3721.341479]  [<ffffffff8108d84b>] worker_thread+0x11b/0x3f0
> [ 3721.341482]  [<ffffffff8108d730>] ? create_worker+0x1e0/0x1e0
> [ 3721.341485]  [<ffffffff81093359>] kthread+0xc9/0xe0
> [ 3721.341487]  [<ffffffff81093290>] ? flush_kthread_worker+0x90/0x90
> [ 3721.341490]  [<ffffffff817a64bc>] ret_from_fork+0x7c/0xb0
> [ 3721.341492]  [<ffffffff81093290>] ? flush_kthread_worker+0x90/0x90
>
>
> The hung tasks do not happen with every dd test, but they happen fairly
> often, especially when I run several dd tests concurrently.
>
> An example test that generated the hung tasks above after just two runs:
>
> # dd if=/dev/zero of=/tmp/cephfs/4G bs=1M count=4K oflag=direct & dd
> if=/dev/zero of=/tmp/cephfs/4G1 bs=1M count=4K oflag=direct & dd
> if=/dev/zero of=/tmp/cephfs/4G2 bs=1M count=4K oflag=direct & dd
> if=/dev/zero of=/tmp/cephfs/4G3 bs=1M count=4K oflag=direct &
>
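> In case it helps to reproduce, the same test written as a loop (just a
> sketch, equivalent to the one-liner above), with a dmesg check afterwards
> for the hung task messages:
>
> # for i in "" 1 2 3 ; do dd if=/dev/zero of=/tmp/cephfs/4G$i bs=1M count=4K oflag=direct & done ; wait
> # dmesg | grep -i 'blocked for more than'
>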
> Cheers
>
> ________________________________
>
> From: "Andrei Mikhailovsky" <andrei@xxxxxxxxxx>
> To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
> Sent: Friday, 28 November, 2014 11:22:07 AM
> Subject: Re:  Giant + nfs over cephfs hang tasks
>
>
> I am also noticing some delays when working with nfs over cephfs,
> especially when making an initial connection. For instance, I run the
> following:
>
> # time for i in {0..10} ; do time touch /tmp/cephfs/test-$i ; done
>
> where /tmp/cephfs is the nfs mount point running over cephfs
>
> I am noticing that the first file is only created after about 20-30
> seconds. All of the following files are created with no delay.
>
> If I run the command again immediately, all of the files are created
> quickly without any delay. However, if I wait 20-30 minutes and run the
> command again, the delay on the first file is back.
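>
> To make the pattern easier to see, something along these lines (just a
> rough sketch of the same test) times the first touch separately from the
> rest:
>
> # time touch /tmp/cephfs/test-first
> # time for i in {1..10} ; do touch /tmp/cephfs/test-$i ; done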
>
> Has anyone experienced similar issues?
>
> Andrei
> ________________________________
>
> From: "Andrei Mikhailovsky" <andrei@xxxxxxxxxx>
> To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
> Sent: Friday, 28 November, 2014 9:08:17 AM
> Subject:  Giant + nfs over cephfs hang tasks
>
> Hello guys,
>
> I'm seeing a number of hung tasks from the nfsd service running on top of
> a kernel-mounted cephfs file system. Here is an example of one of them.
>
> [433079.991218] INFO: task nfsd:32625 blocked for more than 120 seconds.
> [433080.029685]       Not tainted 3.15.10-031510-generic #201408132333
> [433080.068036] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [433080.144235] nfsd            D 000000000000000a     0 32625      2 0x00000000
> [433080.144241]  ffff8801a94dba78 0000000000000002 ffff8801a94dba38 ffff8801a94dbfd8
> [433080.144244]  0000000000014540 0000000000014540 ffff880673d63260 ffff880491d264c0
> [433080.144247]  ffff8801a94dba78 ffff88067fd14e40 ffff880491d264c0 ffffffff8115dff0
> [433080.144250] Call Trace:
> [433080.144260]  [<ffffffff8115dff0>] ? __lock_page+0x70/0x70
> [433080.144274]  [<ffffffff81778449>] schedule+0x29/0x70
> [433080.144279]  [<ffffffff8177851f>] io_schedule+0x8f/0xd0
> [433080.144282]  [<ffffffff8115dffe>] sleep_on_page+0xe/0x20
> [433080.144286]  [<ffffffff81778be2>] __wait_on_bit+0x62/0x90
> [433080.144288]  [<ffffffff8115eacb>] ? find_get_pages_tag+0xcb/0x170
> [433080.144291]  [<ffffffff8115e160>] wait_on_page_bit+0x80/0x90
> [433080.144296]  [<ffffffff810b54a0>] ? wake_atomic_t_function+0x40/0x40
> [433080.144299]  [<ffffffff8115e334>] filemap_fdatawait_range+0xf4/0x180
> [433080.144302]  [<ffffffff8116027d>] filemap_write_and_wait_range+0x4d/0x80
> [433080.144315]  [<ffffffffa06bf1b8>] ceph_fsync+0x58/0x200 [ceph]
> [433080.144330]  [<ffffffff813308f5>] ? ima_file_check+0x35/0x40
> [433080.144337]  [<ffffffff812028c8>] vfs_fsync_range+0x18/0x30
> [433080.144352]  [<ffffffffa03ee491>] nfsd_commit+0xb1/0xd0 [nfsd]
> [433080.144363]  [<ffffffffa03fb787>] nfsd4_commit+0x57/0x60 [nfsd]
> [433080.144370]  [<ffffffffa03fcf9e>] nfsd4_proc_compound+0x54e/0x740 [nfsd]
> [433080.144377]  [<ffffffffa03e8e05>] nfsd_dispatch+0xe5/0x230 [nfsd]
> [433080.144401]  [<ffffffffa03205a5>] svc_process_common+0x345/0x680 [sunrpc]
> [433080.144413]  [<ffffffffa0320c33>] svc_process+0x103/0x160 [sunrpc]
> [433080.144418]  [<ffffffffa03e895f>] nfsd+0xbf/0x130 [nfsd]
> [433080.144424]  [<ffffffffa03e88a0>] ? nfsd_destroy+0x80/0x80 [nfsd]
> [433080.144428]  [<ffffffff81091439>] kthread+0xc9/0xe0
> [433080.144431]  [<ffffffff81091370>] ? flush_kthread_worker+0xb0/0xb0
> [433080.144434]  [<ffffffff8178567c>] ret_from_fork+0x7c/0xb0
> [433080.144437]  [<ffffffff81091370>] ? flush_kthread_worker+0xb0/0xb0
>
>
>
> I am using Ubuntu 12.04 servers with 3.15.10 kernel and ceph Giant.

Are you seeing any "socket closed" messages from libceph in dmesg after
you start the dds and before they finish?
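
Something like this should show them if they are there (just a quick grep
over the kernel log; adjust the pattern as needed):

# dmesg | grep -i 'socket closed'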

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



