Status and dput() crash.

I've actually been working pretty hard with your VFS fixes despite my
silence.  Most of my time has been spent just trying to get our local
load tests working with a vanilla 2.6.35 kernel; I finally got that
going this week and started taking some data.  Unfortunately when I
booted your kernel in our environment, things fell over.

First off, I'm using a very slightly modified version of your tree, on
the vfs-scale branch.  The only changes are a SATA device driver patch
and a very minor local hack to add an extra directory to /proc; that's
it.

In our environment we make very heavy use of cgroups/cpusets and in fact
the first thing that happens is that a daemon runs that configures
things properly.  I'm not completely familiar with the daemon itself but
suffice to say that while it works fine on the base 2.6.35 kernel, it
crashes on the vfs-scale kernel.

I've appended the crash info (plus some extra info I added just before
the crash) below, but the short version is that it's crashing in dput(),
called from cgroup_clear_directory().  The actual crash is a NULL
pointer dereference at the call to dentry->d_op->d_delete().  It turns
out that, as you can see below, DCACHE_OP_DELETE is set in d_flags but
d_op->d_delete is NULL.  From a scan of the code that's a "can't happen"
but I imagine there is a race here.  I also note that the UNHASHED flag
is set, making me think that maybe we found our way onto a node that
just got created?
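
To make the inconsistency concrete, here is a minimal userspace sketch of
the invariant that appears to be violated: the "has a d_delete op" flag is
meant to be a cached copy of whether d_op->d_delete is non-NULL.  Everything
in it is hypothetical and for illustration only.  The model_* names, the
struct layouts and the flag bit values are not the kernel's, and swapping
d_op by hand is just one way to reproduce the reported state in the model,
not a claim about how the kernel gets there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits only; these are NOT the kernel's values. */
#define MODEL_UNHASHED   0x1u
#define MODEL_OP_DELETE  0x2u

struct model_dentry;

/* Hypothetical stand-in for the dentry operations table. */
struct model_dentry_ops {
	int (*d_delete)(const struct model_dentry *);
};

/* Hypothetical stand-in for a dentry: only the fields discussed here. */
struct model_dentry {
	unsigned int d_flags;
	const struct model_dentry_ops *d_op;
};

/* Install an ops table and cache "d_delete is present" in d_flags. */
static void model_set_d_op(struct model_dentry *d,
			   const struct model_dentry_ops *ops)
{
	assert(d != NULL);
	d->d_op = ops;
	d->d_flags &= ~MODEL_OP_DELETE;
	if (ops && ops->d_delete)
		d->d_flags |= MODEL_OP_DELETE;
}

/* True when the cached flag agrees with the actual function pointer. */
static bool model_flags_consistent(const struct model_dentry *d)
{
	bool flagged = (d->d_flags & MODEL_OP_DELETE) != 0;
	bool has_op = d->d_op != NULL && d->d_op->d_delete != NULL;

	return flagged == has_op;
}

/*
 * Guarded call: returns 0 if the flag is clear, -1 if the flag is set
 * but the pointer is NULL (the state in the log), else the op's result.
 */
static int model_call_d_delete(const struct model_dentry *d)
{
	if (!(d->d_flags & MODEL_OP_DELETE))
		return 0;
	if (d->d_op == NULL || d->d_op->d_delete == NULL)
		return -1;
	return d->d_op->d_delete(d);
}

/* Trivial op used to exercise the model; always asks for deletion. */
static int model_always_delete(const struct model_dentry *d)
{
	(void)d;
	return 1;
}
```

In this model, going through model_set_d_op() keeps the flag and the
pointer in agreement; replacing d_op directly with a table whose d_delete
is NULL leaves the flag stale, and an unguarded call through the pointer
would then jump to address 0, which is consistent with the RIP in the oops
below.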

I'm going to keep digging into this but I thought I would send you a
heads-up in advance.  And just maybe you'll figure it out before I
do. :-)



Starting cpuset:  [   28.029639] dentry->d_op->d_delete is NULL (dentry ffff880c7b4a2e10 d_op ffffffff81612260)!
[   28.038007]  flags 10010, inode ffff880c7e285280, iname cpuset.memory_spread_slab, name cpuset.memory_spread_slab
[   28.048277] BUG: unable to handle kernel NULL pointer dereference at (null)
[   28.049260] IP: [<(null)>] (null)
[   28.049260] PGD 107c412067 PUD 107cf4f067 PMD 0
[   28.049260] Oops: 0010 [#1] SMP DEBUG_PAGEALLOC
[   28.049260] last sysfs file: /sys/devices/system/node/node3/distance
[   28.049260] CPU 8 
[   28.049260] Modules linked in: sata_mv freq_table processor mperf msr cpuid genrtc
[   28.049260] 
[   28.049260] Pid: 5042, comm: cpuset-daemon Not tainted 2.6.35-dbg #10
[   28.049260] RIP: 0010:[<0000000000000000>]  [<(null)>] (null)
[   28.049260] RSP: 0018:ffff880c7d885d40  EFLAGS: 00210292
[   28.049260] RAX: ffffffff81612260 RBX: ffff880c7b4a2e10 RCX: 0000000000000030
[   28.049260] RDX: 0000000000000011 RSI: 0000000000000001 RDI: ffff880c7b4a2e10
[   28.049260] RBP: ffff880c7d885d98 R08: 0000000000200086 R09: 0000000000000000
[   28.049260] R10: ffffffff81a344b8 R11: ffffffff810849bc R12: ffffea0000000000
[   28.049260] R13: ffff880c7b4a2e70 R14: eeeeeeeeeeeeeeef R15: ffff880c7b4a2ed0
[   28.049260] FS:  0000000000000000(0000) GS:ffff88088e200000(0063) knlGS:00000000f74cda20
[   28.049260] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[   28.049260] CR2: 0000000000000000 CR3: 000000107c57f000 CR4: 00000000000006e0
[   28.049260] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   28.049260] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   28.049260] Process <cpuset-daemon> (pid: 5042, threadinfo ffff880c7d884000, task ffff880c7ba65410)
[   28.049260] Stack:
[   28.049260]  ffffffff81127337 ffff880c7e285280 ffff880c7b4a2e10 ffff880c7b4f80d0
[   28.049260] <0> ffff880c7b4a2ed0 ffff880c7d885d98 ffff880c7b4a2e10 ffff880c7b4f8000
[   28.049260] <0> ffff880c7b4f8060 ffff880c7b4f80d0 ffff880c7b4a2ed0 ffff880c7d885de8
[   28.049260] Call Trace:
[   28.049260]  [<ffffffff81127337>] ? dput+0xcb/0x3d3
[   28.049260]  [<ffffffff810a3152>] cgroup_clear_directory+0xca/0x104
[   28.049260]  [<ffffffff810a346c>] cgroup_rmdir+0x2e0/0x429
[   28.049260]  [<ffffffff810806da>] ? autoremove_wake_function+0x0/0x39
[   28.049260]  [<ffffffff8111f3d8>] vfs_rmdir+0x7e/0xc2
[   28.049260]  [<ffffffff81120aff>] do_rmdir+0xb7/0x106
[   28.049260]  [<ffffffff810907a3>] ? trace_hardirqs_on_caller+0x10c/0x130
[   28.049260]  [<ffffffff8145dec6>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   28.049260]  [<ffffffff810f1202>] ? spin_lock+0xe/0x10
[   28.049260]  [<ffffffff81120b8f>] sys_rmdir+0x16/0x18
[   28.049260]  [<ffffffff81055b78>] cstar_dispatch+0x7/0x2c

-- 
Frank Mayhar <fmayhar@xxxxxxxxxx>
Google Inc.
