On Tue, Nov 19, 2019 at 10:49:56AM -0500, Andrew Carr wrote:
> Dave / Eric / Others,
>
> Syslog: https://pastebin.com/QYQYpPFY
>
> Dmesg: https://pastebin.com/MdBCPmp9

which shows no stack traces, again.

Anyway, you've twiddled mkfs knobs on these filesystems, and that
is the likely cause of the issue: the filesystem is using 64k
directory blocks - the allocation size is larger than 64kB:

[Sun Nov 17 21:40:05 2019] XFS: nginx(31293) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x250)

Upstream fixed this some time ago:

$ ▶ gl -n 1 -p cb0a8d23024e
commit cb0a8d23024e7bd234dea4d0fc5c4902a8dda766
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Mar 6 17:03:28 2018 -0800

    xfs: fall back to vmalloc when allocation log vector buffers

    When using large directory blocks, we regularly see memory
    allocations of >64k being made for the shadow log vector buffer.
    When we are under memory pressure, kmalloc() may not be able to find
    contiguous memory chunks large enough to satisfy these allocations
    easily, and if memory is fragmented we can potentially stall here.

    TO avoid this problem, switch the log vector buffer allocation to
    use kmem_alloc_large(). This will allow failed allocations to fall
    back to vmalloc and so remove the dependency on large contiguous
    regions of memory being available. This should prevent slowdowns
    and potential stalls when memory is low and/or fragmented.

    Signed-Off-By: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
    Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx