Re: xfs_iext_realloc_indirect and "XFS: possible memory allocation deadlock"

On Tue, Jun 23, 2015 at 10:18:21AM +0200, Alex Lyakas wrote:
> Greetings,
> 
> We are hitting an issue with XFS printing messages like “XFS:
> possible memory allocation deadlock in kmem_alloc
> (mode:0x250)” and stack trace like in [1]. Eventually,
> hung-task panic kicks in with stack traces like [2].
> 
> We are running kernel 3.8.13. I see that in
> http://oss.sgi.com/archives/xfs/2012-01/msg00341.html a similar
> issue has been discussed, but no code changes followed comparing
> to what we have in 3.8.13.
> 
> Any suggestion on how to move forward with this problem? For
> example, does this memory has to be really allocated with kmalloc
> (i.e., physically continuous) or vmalloc can be used?

We left it alone because it is relatively rare for people to hit it,
and generally it indicates a severe fragmentation problem when they
do hit it (i.e. a file with millions of extents in it). Can you
track down the file that this is occurring against and see how many
extents it has?
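
One way to get that number is to ask the kernel through the FIEMAP ioctl. The following is a minimal sketch rather than a polished tool: it assumes a Linux system and a filesystem that implements FIEMAP, and the helper names (build_request, mapped_extents, count_extents) are made up for illustration. Passing fm_extent_count = 0 asks the kernel to report only how many extents are mapped, without copying them out.

```python
import fcntl
import struct
import sys

# FS_IOC_FIEMAP is _IOWR('f', 11, struct fiemap); the header is 32 bytes.
FS_IOC_FIEMAP = 0xC020660B
FIEMAP_MAX_OFFSET = 0xFFFFFFFFFFFFFFFF

# struct fiemap header fields, in order:
# fm_start, fm_length, fm_flags, fm_mapped_extents, fm_extent_count, fm_reserved
FIEMAP_HEADER = struct.Struct("=QQIIII")


def build_request():
    """Request covering the whole file, with room for zero extent records."""
    return bytearray(FIEMAP_HEADER.pack(0, FIEMAP_MAX_OFFSET, 0, 0, 0, 0))


def mapped_extents(buf):
    """Pull fm_mapped_extents out of a header the kernel has filled in."""
    return FIEMAP_HEADER.unpack(bytes(buf))[3]


def count_extents(path):
    """Return the number of extents the kernel reports for path."""
    with open(path, "rb") as f:
        buf = build_request()
        # The bytearray is updated in place by the ioctl.
        fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf)
        return mapped_extents(buf)


if __name__ == "__main__":
    for p in sys.argv[1:]:
        print(p, count_extents(p))
```

Run it with the suspect file paths as arguments. Tools such as xfs_bmap or filefrag report the same information and are probably the more convenient choice on a production box.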

i.e. you may be much better off by taking measures to avoid excessive
fragmentation than removing the canary from the mine...
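
To put rough numbers on why the extent count matters for xfs_iext_realloc_indirect, here is a back-of-the-envelope sketch. The constants are assumptions recalled from 3.x-era fs/xfs/xfs_inode.h (4096-byte in-core extent buffers, 16-byte extent records, 16-byte indirection entries on 64-bit) and should be checked against your tree. The result is a lower bound, since extent buffers are not necessarily full.

```python
import math

# Assumed values, not taken from this thread; verify against your kernel source.
IEXT_BUFSZ = 4096   # bytes per in-core extent buffer
REC_SIZE = 16       # bytes per in-core extent record
IREC_SIZE = 16      # bytes per indirection array entry on 64-bit
PAGE_SIZE = 4096    # bytes per page


def indirection_array_bytes(nextents):
    """Minimum size of the contiguous indirection array for nextents extents."""
    exts_per_buf = IEXT_BUFSZ // REC_SIZE          # 256 records per buffer
    nbufs = math.ceil(nextents / exts_per_buf)     # one array entry per buffer
    return nbufs * IREC_SIZE


def page_order(nbytes):
    """Smallest order such that 2**order pages hold nbytes."""
    pages = max(1, math.ceil(nbytes / PAGE_SIZE))
    return (pages - 1).bit_length()


if __name__ == "__main__":
    for n in (10_000, 1_000_000, 10_000_000):
        size = indirection_array_bytes(n)
        print(f"{n} extents: {size} bytes, order {page_order(size)}")
```

Under these assumptions a file with a million extents needs a physically contiguous array of about 61 KiB (an order-4 allocation), and every reallocation as the file grows has to find a run at least that large.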

> [109626.075483] nfsd            D 0000000000000002     0 20042      2 0x00000000

Hmmm - it's also a file written by the NFS server - is this on a
dedicated NFS server?

> [109626.075483]  [<ffffffffa01be6b0>] ? nfsd_setuser+0x120/0x2b0 [nfsd]
> [109626.075483]  [<ffffffff8119bccc>] ? vfs_writev+0x3c/0x50
> [109626.075483]  [<ffffffffa01b7dd2>] ? nfsd_vfs_write.isra.12+0x92/0x350 [nfsd]
> [109626.075483]  [<ffffffff8119a6cb>] ? dentry_open+0x6b/0xd0
> [109626.075483]  [<ffffffffa01ba679>] ? nfsd_write+0xf9/0x110 [nfsd]
> [109626.075483]  [<ffffffffa01c4dd1>] ? nfsd3_proc_write+0xb1/0x140 [nfsd]

Interesting that this is an NFSv3 write...

> [87303.976119] INFO: task nfsd:5684 blocked for more than 180 seconds.
> [87303.976976] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [87303.978012] nfsd            D 0000000000000003     0  5684      2 0x00000000
....
> [87303.978174]  [<ffffffffa0269623>] nfsd_write+0xa3/0x110 [nfsd]
> [87303.978182]  [<ffffffffa027794c>] nfsd4_write+0x1cc/0x250 [nfsd]
> [87303.978189]  [<ffffffffa027746c>] nfsd4_proc_compound+0x5ac/0x7a0 [nfsd]

And that is an NFSv4 write. Do you have multiple clients writing to the
same file using different versions of the NFS protocol?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx




