[Cc'ed lkml, hence the full-quote]

On Fri, 12 Oct 2012 at 08:33, Dave Chinner wrote:
> On Thu, Oct 11, 2012 at 11:13:14AM -0700, Christian Kujau wrote:
> > Hi,
> >
> > since Linux 3.5 I've been seeing these "inconsistent lock state"
> > lockdep warnings [0]. They show up in 3.6 as well [1]. I was told [2]
> > that I may have run out of inode attributes. This may well be the
> > case, but I cannot reformat the disk right now and will have to live
> > with that warning a while longer.
> >
> > I got the warning again today, but 8h later the system hung and
> > eventually shut down. The last message from the box was received via
> > netconsole:
> >
> >   XFS: possible memory allocation deadlock in xfs_buf_allocate_memory (mode:0x250)
> >
> > This has been reported [3] for 3.2.1, but when the message was
> > printed I was not around and could not watch /proc/slabinfo.
> >
> > The last -MARK- message (a heartbeat message from syslog, printing
> > "MARK" every 5 minutes) was received at 07:55 local time; the
> > netconsole message above was received at 08:09, so two -MARK-
> > messages were lost. sar(1) stopped recording at 08:05.
> >
> > These two incidents (the lockdep warning and the final lockup) may be
> > unrelated (and timestamp-wise, they seem to be), but I thought I'd
> > better report it.
> >
> > Full dmesg and .config: http://nerdbynature.de/bits/3.6.0/xfs/o
>
> The inconsistent lock state is this path:
>
> Oct 11 00:18:27 alice kernel: [261506.767190] [e5e2fc70] [c00b5d44] vm_map_ram+0x228/0x5a0
> Oct 11 00:18:27 alice kernel: [261506.768354] [e5e2fcf0] [c01a6f9c] _xfs_buf_map_pages+0x44/0x104
> Oct 11 00:18:27 alice kernel: [261506.769522] [e5e2fd10] [c01a8090] xfs_buf_get_map+0x74/0x11c
> Oct 11 00:18:27 alice kernel: [261506.770698] [e5e2fd30] [c01ffad8] xfs_trans_get_buf_map+0xc0/0xdc
> Oct 11 00:18:27 alice kernel: [261506.772106] [e5e2fd50] [c01e9504] xfs_ifree_cluster+0x3f4/0x5b0
> Oct 11 00:18:27 alice kernel: [261506.773309] [e5e2fde0] [c01eabc0] xfs_ifree+0xec/0xf0
> Oct 11 00:18:27 alice kernel: [261506.774493] [e5e2fe20] [c01bd5c0] xfs_inactive+0x274/0x448
> Oct 11 00:18:27 alice kernel: [261506.775672] [e5e2fe60] [c01b8450] xfs_fs_evict_inode+0x60/0x74
> Oct 11 00:18:27 alice kernel: [261506.776863] [e5e2fe70] [c00dfa88] evict+0xc0/0x1b4
> Oct 11 00:18:27 alice kernel: [261506.778046] [e5e2fe90] [c00db968] d_delete+0x1b0/0x1ec
> Oct 11 00:18:27 alice kernel: [261506.779235] [e5e2feb0] [c00d2e24] vfs_rmdir+0x124/0x128
> Oct 11 00:18:27 alice kernel: [261506.780425] [e5e2fed0] [c00d2f10] do_rmdir+0xe8/0x110
> Oct 11 00:18:27 alice kernel: [261506.781619] [e5e2ff40] [c0010aac] ret_from_syscall+0x0/0x38
>
> Which indicates that it is this:
>
> > [1] http://oss.sgi.com/archives/xfs/2012-09/msg00305.html
>
> which is a real bug in the VM code that the VM developers refuse to
> fix even though it is simple to do and patches have been posted
> several times to fix it.
>
> /me points to his recent rant rather than repeating it:
>
> https://lkml.org/lkml/2012/6/13/628

I read through it but only understood half of it. Still, it's
interesting to see that there seems to be a real issue.
> However, that is unrelated to this message:
>
> XFS: possible memory allocation deadlock in xfs_buf_allocate_memory (mode:0x250)
>
> which triggers when memory cannot be allocated. mode 0x250 is
>
> ___GFP_NOWARN | ___GFP_IO | ___GFP_WAIT
>
> or more commonly known as:
>
> GFP_NOFS
>
> with warnings turned off. Basically the warning is saying "we're
> trying really hard to allocate memory, but we're not making
> progress". If it was only emitted once, however, it means that
> progress was made, as the message is emitted every 100 times through
> the loop and so only one message means it looped less than 200
> times...

Memory usage at the time was no different from other days, so I don't
know why it had a hard time allocating memory. But I don't have any
numbers.

> What it does imply, however, is that vm_map_ram() is being called
> from GFP_NOFS context quite regularly and might be blocking there,
> and so the lockdep warning is more than just a nuisance - your
> system may indeed have hung there, but I don't have enough
> information to say for sure.
>
> When it hangs next time, can you get a blocked task dump from sysrq
> (i.e. sysrq-w, or "echo w > /proc/sysrq-trigger")? That's the only

OK, I will try to get this information the next time it happens.

> way we're going to know where the system hung. You might also want
> to ensure the hung task functionality is built into your kernel, so
> it automatically dumps stack traces for tasks hung longer than
> 120s...

Yes, the option was already set:

  CONFIG_DETECT_HUNG_TASK=y
  CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
  # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
  CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0

Thanks for digging through this and for the explanation!

Christian.
-- 
BOFH excuse #398:

Data for intranet got routed through the extranet and landed on the
internet.

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs