On Wed 01-07-15 08:58:51, Dave Chinner wrote:
[...]
> *blink*
>
> /me re-reads again
>
> That assumption is fundamentally broken. Filesystems use GFP_NOFS
> because the filesystem holds resources that can prevent memory
> reclaim making forwards progress if it re-enters the filesystem or
> blocks on anything filesystem related. memcg does not change that,
> and I'm kinda scared to learn that memcg plays fast and loose like
> this.
>
> For example: IO completion might require unwritten extent conversion
> which executes filesystem transactions and GFP_NOFS allocations. The
> writeback flag on the pages can not be cleared until unwritten
> extent conversion completes. Hence memory reclaim cannot wait on
> page writeback to complete in GFP_NOFS context because it is not
> safe to do so, memcg reclaim or otherwise.

Thanks for the clarification.

> > really charge after set_page_writeback (called from ext4_bio_write_page)
> > and before the page is really submitted (when the bio is full or
> > explicitly via ext4_io_submit). I thought that io_submit_add_bh submits
> > the page but it doesn't do that necessarily.
>
> XFS does exactly the same thing - the underlying algorithm ext4 uses
> to build large bios efficiently was copied from XFS. And FWIW XFS has
> been using this algorithm since 2.6.15....

OK, I will mark the patch for stable then.

Thanks!
--
Michal Hocko
SUSE Labs