On Tue, 29 Jun 2010 12:34:46 +0100 Mel Gorman <mel@xxxxxxxxx> wrote:

> When memory is under enough pressure, a process may enter direct
> reclaim to free pages in the same manner kswapd does. If a dirty page is
> encountered during the scan, this page is written to backing storage using
> mapping->writepage. This can result in very deep call stacks, particularly
> if the target storage or filesystem are complex. It has already been observed
> on XFS that the stack overflows but the problem is not XFS-specific.
>
> This patch prevents direct reclaim writing back pages by not setting
> may_writepage in scan_control. Instead, dirty pages are placed back on the
> LRU lists for either background writing by the BDI threads or kswapd. If
> in direct lumpy reclaim and dirty pages are encountered, the process will
> stall for the background flusher before trying to reclaim the pages again.
>
> Memory control groups do not have a kswapd-like thread nor do pages get
> direct reclaimed from the page allocator. Instead, memory control group
> pages are reclaimed when the quota is being exceeded or the group is being
> shrunk. As it is not expected that the entry points into page reclaim are
> deep call chains memcg is still allowed to writeback dirty pages.

I already had "[PATCH 01/14] vmscan: Fix mapping use after free" and I'll
send that in for 2.6.35.

I grabbed [02/14] up to [11/14]. Including "[PATCH 06/14] vmscan: kill
prev_priority completely", grumpyouallsuck.

I wimped out at this, "Do not writeback pages in direct reclaim". It
really is a profound change and needs a bit more thought, discussion and
if possible testing which is designed to explore possible pathologies.