On Wed, May 28, 2014 at 09:09:23AM -0700, Linus Torvalds wrote:
> On Tue, May 27, 2014 at 11:53 PM, Minchan Kim <minchan@xxxxxxxxxx> wrote:
> >
> > So, my stupid idea is just let's expand stack size and keep an eye
> > toward stack consumption on each kernel functions via stacktrace of ftrace.
.....
> But what *does* stand out (once again) is that we probably shouldn't
> do swap-out in direct reclaim. This came up the last time we had stack
> issues (XFS) too. I really do suspect that direct reclaim should only
> do the kind of reclaim that does not need any IO at all.
>
> I think we _do_ generally avoid IO in direct reclaim, but swap is
> special. And not for a good reason, afaik. DaveC, remind me, I think
> you said something about the swap case the last time this came up..

Right, we do generally avoid IO through filesystems via direct
reclaim because delayed allocation requires significant amounts of
additional memory, stack space and IO.

However, swap doesn't have that overhead - it's just the IO stack
that it drives through submit_bio(), and the worst case I'd seen
through that path was much less than other reclaim stack path usage.
I haven't seen swap in any of the stack overflows from production
machines, and I only rarely see it in worst case stack usage
profiles on my test machines.

Indeed, the call chain reported here is not caused by swap issuing
IO. We scheduled in the swap code (throttling waiting for
congestion, I think) with a plugged block device (from the ext4
writeback layer) with pending bios queued on it and the scheduler
has triggered a flush of the device. submit_bio in the swap path has
much less stack usage than io_schedule() because it doesn't have any
of the scheduler or plug list flushing overhead in the stack.

So, realistically, the swap path is not worst case stack usage here
and disabling it won't prevent this stack overflow from happening.
Direct reclaim will simply throttle elsewhere and that will still
cause the plug to be flushed, the IO to be issued and the stack to
overflow.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
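
The quoted suggestion of watching per-function stack consumption with
the ftrace stack tracer can be sketched roughly as follows. This is a
minimal sketch, not anything posted in this thread: it assumes a kernel
built with CONFIG_STACK_TRACER, the tracer switched on via
/proc/sys/kernel/stack_tracer_enabled, and the deepest stack read from
/sys/kernel/debug/tracing/stack_trace. The helper names
(parse_stack_trace, top_consumers) are hypothetical, and the sample
input uses made-up placeholder functions and sizes rather than real
measurements.

```python
import re

# Made-up sample in the layout the ftrace stack tracer prints in
# /sys/kernel/debug/tracing/stack_trace. Function names and sizes are
# placeholders, NOT measurements from this thread.
SAMPLE = """\
        Depth    Size   Location    (4 entries)
        -----    ----   --------
  0)     1000     400   example_leaf+0x10/0x80
  1)      600     100   example_mid+0x20/0x90
  2)      500     300   example_flush+0x30/0xa0
  3)      200     200   example_entry+0x40/0xb0
"""

# One data row: "<index>)  <depth>  <size>  <symbol+offset/length>"
ROW = re.compile(r"^\s*(\d+)\)\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_stack_trace(text):
    """Return a list of (function, frame_size, depth) tuples, deepest first."""
    frames = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # header and separator lines do not match and are skipped
            depth = int(m.group(2))
            size = int(m.group(3))
            function = m.group(4).split("+")[0]  # drop "+offset/length"
            frames.append((function, size, depth))
    return frames


def top_consumers(frames, n=3):
    """Return the n frames with the largest per-frame stack size."""
    return sorted(frames, key=lambda f: f[1], reverse=True)[:n]


if __name__ == "__main__":
    for name, size, depth in top_consumers(parse_stack_trace(SAMPLE)):
        print(f"{name:20s} {size:6d} bytes (depth {depth})")
```

On a real system, the contents of the stack_trace file would replace
SAMPLE. Sorting by the Size column rather than Depth highlights which
individual frames (for example scheduler or plug-flush frames versus
the submit_bio() path) contribute most to the worst-case chain.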