On Mon, 16 Jul 2012, Michal Hocko wrote:
> On Mon 16-07-12 01:35:34, Hugh Dickins wrote:
> > But even so, the test still OOMs sometimes: when originally testing
> > on 3.5-rc6, it OOMed about one time in five or ten; when testing
> > just now on 3.5-rc6-mm1, it OOMed on the first iteration.
> >
> > This residual problem comes from an accumulation of pages under
> > ordinary writeback, not marked PageReclaim, so rightly not causing
> > the memcg check to wait on their writeback: these too can prevent
> > shrink_page_list() from freeing any pages, so many times that memcg
> > reclaim fails and OOMs.
>
> I guess you managed to trigger this with 20M limit, right?

That's right.

> I have tested
> with different group sizes but the writeback didn't trigger for most of
> them and all the dirty data were flushed from the reclaim.

I didn't examine writeback stats to confirm, but I guess that just
occasionally it managed to come in and do enough work to confound us.

> Have you used any special setting the dirty ratio?

No, I wasn't imaginative enough to try that.

> Or was it with xfs (IIUC that one
> does ignore writeback from the direct reclaim completely).

No, just ext4 at that point.

I have since tested the final patch with ext4, ext3 (by ext3 driver
and by ext4 driver), ext2 (by ext2 driver and by ext4 driver), xfs,
btrfs, vfat, tmpfs (with swap on the USB stick) and block device:
about an hour on each, no surprises, all okay.

But I didn't experiment beyond the 20M memcg.

Hugh