I thought I'd sent this in December, but I didn't see it in the
archives, so I thought I'd take another crack at it.
The problem seems to be spreading to clients mounting Linux server
hosted NFS exports as well.
Any help would be appreciated.
Thanks,
Brian
----- Forwarded message from Brian Kroth <bpkroth@xxxxxxxxx> -----
Date: Fri, 16 Dec 2011 10:36:45 -0600
From: Brian Kroth <bpkroth@xxxxxxxxx>
To: linux-cachefs@xxxxxxxxxx
Subject: hang on __fscache_wait_on_page_write and
__nfs_fscache_invalidate_page
Reply-To: Brian Kroth <bpkroth@xxxxxxxxx>
User-Agent: Mutt/1.5.20 (2009-06-14)
X-Operating-System: Linux 2.6.32-bpo.3-amd64 x86_64
Hello all. I hope this is the right place for this question. If not
please let me know where else it should go.
I've got some machines that from time to time will hang waiting on an
umount.nfs command and eventually spit out a call trace referencing
__fscache_wait_on_page_write and __nfs_fscache_invalidate_page.
The situation is as follows:
Machines are configure to nfs automount some set of filesystems. Some
are provided by an EMC Celerra NAS, others by Linux hosts, others by
Solaris.
Most of the mounts are configured to use the fsc option so that data
which is read is cached for faster access the next time around.
On one of these mounts I've seen on a couple client nodes about a half
dozen times now, a situation where
1) The automounter will attempt to unmount a particular volume due to
inactivity. I've used lsof to confirm that nothing was using the
mount and this should in theory be a safe operation to do.
2) The umount.nfs process will block with the following call trace:
[ 722.136057] INFO: task umount.nfs:926 blocked for more than 120 seconds.
[ 722.136061] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 722.136065] umount.nfs D ffff88012bb807a0 0 926 1 0x00000000
[ 722.136071] ffff88012bb807a0 0000000000000082 ffffea0002179998 ffff8800c15ad1c0
[ 722.136077] 0000000000013b40 ffff880016019fd8 ffff880016019fd8 0000000000013b40
[ 722.136083] ffff88012bb807a0 ffff880016018010 0000000000000246 0000000000000001
[ 722.136088] Call Trace:
[ 722.136100] [<ffffffffa0ef8c3a>] ? __fscache_wait_on_page_write+0x83/0x9d [fscache]
[ 722.136106] [<ffffffff8105f3b1>] ? wake_up_bit+0x20/0x20
[ 722.136120] [<ffffffffa0f444c1>] ? __nfs_fscache_invalidate_page+0x45/0x7a [nfs]
[ 722.136126] [<ffffffff810bf556>] ? truncate_inode_page+0x45/0x78
[ 722.136130] [<ffffffff810bfa53>] ? truncate_inode_pages_range+0xca/0x2f2
[ 722.136140] [<ffffffffa0f1f3bd>] ? nfs_access_free_entry+0x15/0x1f [nfs]
[ 722.136150] [<ffffffffa0f21477>] ? nfs_access_zap_cache+0x116/0x12a [nfs]
[ 722.136161] [<ffffffffa0f2380c>] ? nfs_evict_inode+0x12/0x23 [nfs]
[ 722.136166] [<ffffffff8110e5b9>] ? evict+0x79/0x116
[ 722.136169] [<ffffffff8110e682>] ? dispose_list+0x2c/0x36
[ 722.136173] [<ffffffff8110e865>] ? evict_inodes+0xc7/0xea
[ 722.136179] [<ffffffff810fd872>] ? generic_shutdown_super+0x4e/0xd4
[ 722.136183] [<ffffffff810fd960>] ? kill_anon_super+0x9/0x40
[ 722.136194] [<ffffffffa0f27996>] ? nfs_kill_super+0x15/0x28 [nfs]
[ 722.136199] [<ffffffff810fd9d0>] ? deactivate_locked_super+0x1e/0x44
[ 722.136203] [<ffffffff81112c9c>] ? sys_umount+0x2ea/0x315
[ 722.136207] [<ffffffff81112039>] ? mntput_no_expire+0x8f/0x14f
[ 722.136212] [<ffffffff81339392>] ? system_call_fastpath+0x16/0x1b
(surprisingly early after boot don't you think?)
3) At some point later, the volume will attempt to be accessed again,
causing automount to spawn a mount process, which will hang since
there's already a umount process running.
Since the processes are blocked in the kernel. There's not much I can
do to resolve the issue but kick the machine.
The clients (~200 of them) are all running debian squeeze based
machines with a 2.6.39 kernel (from debian-backports). They run
cachefilesd 0.9-3 with the following conf:
dir /var/cache/fscache
tag mycache
brun 15%
bcull 12%
bstop 10%
frun 15%
fcull 12%
fstop 10%
culltable 16
The backing store for the cache is mounted as follows:
# mount | grep fscache
/dev/mapper/vg-fscache on /var/cache/fscache type ext4 (rw,relatime,user_xattr)
The automount that seems to be having the problems looks like this:
cnerg -hard,vers=3,nosuid,fsc filespace:/groups_cnerg
It's served by the EMC Celerra. However, none of the other 35
automounts that that server serves, nor any of the ones that are
served by the Linux/Solaris hosts, seem to have exhibited this
problem. Not necessarily a causal relationship, but suspicious. It
might just be the way the group is using the volume, though I haven't
been able to pinpoint anything along those lines either.
More telling, I think, is that since I've removed the fsc mount option
from that automount I haven't seen this problem again. However, I
would like to keep that option on.
The trouble in debugging is that I don't know how to reproduce the
problem. If I did, I'd capture some NFS packet traces to verify
nothing is happening there.
Due to the call trace however, it seems to me like the fscache is
failing for some reason.
Which is what brings me here. So, now I'm wondering if anyone has an
insight as to any known problems with a setup like this or other
things I could look into?
Let me know if you need any more details. I have a handful of ps,
lsof, and dmesg dumps I can share.
Thanks,
Brian
----- End forwarded message -----
--
Linux-cachefs mailing list
Linux-cachefs@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cachefs