[add jan kara to cc] On Mon, Sep 04, 2017 at 11:43:53AM +1000, Dave Chinner wrote: > On Sun, Sep 03, 2017 at 12:43:06AM -0700, Christoph Hellwig wrote: > > On Sun, Sep 03, 2017 at 09:22:17AM +0500, Михаил Гаврилов wrote: > > > [281502.961248] ------------[ cut here ]------------ > > > [281502.961257] kernel BUG at fs/xfs/xfs_aops.c:853! > > > > This is: > > > > bh = head = page_buffers(page); > > > > Which looks odd and like some sort of VM/writeback change might > > have triggered that we get a page without buffers, despite always > > creating buffers in iomap_begin/end and page_mkwrite. > > Pretty sure this can still happen when buffer_heads_over_limit comes > true. In that case, shrink_active_list() will attempt to strip > the bufferheads off the page even if it's a dirty page. i.e. this > code: > > if (unlikely(buffer_heads_over_limit)) { > if (page_has_private(page) && trylock_page(page)) { > if (page_has_private(page)) > try_to_release_page(page, 0); > unlock_page(page); > } > } > > > There was some discussion about this a while back, the consensus was > that it is a mm bug, but nobody wanted to add a PageDirty check > to try_to_release_page() and so nothing ended up being done about > it in the mm/ subsystem. Instead, filesystems needed to avoid it > if it was a problem for them. Indeed, we fixed it in the filesystem > in 4.8: > > 99579ccec4e2 xfs: skip dirty pages in ->releasepage() > > diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c > index 3ba0809e0be8..6135787500fc 100644 > --- a/fs/xfs/xfs_aops.c > +++ b/fs/xfs/xfs_aops.c > @@ -1040,6 +1040,20 @@ xfs_vm_releasepage( > > trace_xfs_releasepage(page->mapping->host, page, 0, 0); > > + /* > + * mm accommodates an old ext3 case where clean pages might not have had > + * the dirty bit cleared. Thus, it can send actual dirty pages to > + * ->releasepage() via shrink_active_list(). Conversely, > + * block_invalidatepage() can send pages that are still marked dirty > + * but otherwise have invalidated buffers. > + * > + * We've historically freed buffers on the latter. Instead, quietly > + * filter out all dirty pages to avoid spurious buffer state warnings. > + * This can likely be removed once shrink_active_list() is fixed. > + */ > + if (PageDirty(page)) > + return 0; > + > xfs_count_page_state(page, &delalloc, &unwritten); > > But looking at the current code, the comment is still mostly there > but the PageDirty() check isn't. > > <sigh> > > In 4.10, this was done: > > commit 0a417b8dc1f10b03e8f558b8a831f07ec4c23795 > Author: Jan Kara <jack@xxxxxxx> > Date: Wed Jan 11 10:20:04 2017 -0800 > > xfs: Timely free truncated dirty pages > > Commit 99579ccec4e2 "xfs: skip dirty pages in ->releasepage()" started > to skip dirty pages in xfs_vm_releasepage() which also has the effect > that if a dirty page is truncated, it does not get freed by > block_invalidatepage() and is lingering in LRU list waiting for reclaim. > So a simple loop like: > > while true; do > dd if=/dev/zero of=file bs=1M count=100 > rm file > done > > will keep using more and more memory until we hit low watermarks and > start pagecache reclaim which will eventually reclaim also the truncate > pages. Keeping these truncated (and thus never usable) pages in memory > is just a waste of memory, is unnecessarily stressing page cache > reclaim, and reportedly also leads to anonymous mmap(2) returning ENOMEM > prematurely. > > So instead of just skipping dirty pages in xfs_vm_releasepage(), return > to old behavior of skipping them only if they have delalloc or unwritten > buffers and fix the spurious warnings by warning only if the page is > clean. > > CC: stable@xxxxxxxxxxxxxxx > CC: Brian Foster <bfoster@xxxxxxxxxx> > CC: Vlastimil Babka <vbabka@xxxxxxx> > Reported-by: Petr T�ma <petr.tuma@xxxxxxxxxxxxxxx> > Fixes: 99579ccec4e271c3d4d4e7c946058766812afdab > Signed-off-by: Jan Kara <jack@xxxxxxx> > Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > > So, yeah, we reverted the fix for a crash rather than trying to fix > the adverse behaviour caused by invalidation of a dirty page. > > e.g. why didn't we simply clear the PageDirty flag in > xfs_vm_invalidatepage()? The page is being invalidated - it's > contents will never get written back - so having delalloc or > unwritten extents over that page at the time it is invalidated is a > bug and the original fix would have triggered warnings about > this.... Seems like a reasonable revert/change, but given that ext3 was killed off long ago, is it even still the case that the mm can feed releasepage a dirty clean page? If that is the case, then isn't it time to fix the mm too? --D > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>