Re: [PATCH -v3] ext4: don't BUG if kernel subsystems dirty pages without asking ext4 first

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/25/22 13:23, Theodore Ts'o wrote:
[un]pin_user_pages_remote is dirtying pages without properly warning
the file system in advance.  This was noted by Jan Kara in 2018[1] and

In 2018, [un]pin_user_pages_remote did not exist. And so what Jan reported
was actually that dio_bio_complete() was calling set_page_dirty_lock()
on pages that were not (any longer) set up for that.

more recently has resulted in bug reports by Syzbot in various Android
kernels[2].

This is technically a bug in mm/gup.c, but arguably ext4 is fragile in

Is it, really? unpin_user_pages_dirty_lock() moved the set_page_dirty_lock()
call into mm/gup.c, but that merely refactored things. The callers are
all over the kernel, and those callers are what need changing in order
to fix this.


thanks,
--
John Hubbard
NVIDIA

that a buggy kernel subsystem which dirty pages without properly
notifying the file system causes ext4 to BUG, while other file systems
are not (although user data likely will be lost).  I suspect in real
life it is rare that people are using RDMA into file-backed memory,
which is why no one has complained to ext4 developers except fuzzing
programs.

So instead of crashing with a BUG, issue a warning (since there may be
potential data loss) and just mark the page as clean to avoid
unprivileged denial of service attacks until the problem can be
properly fixed.  More discussion and background can be found in the
thread starting at [2].

[1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@xxxxxxxxxxxxxx
[2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM@xxxxxxxxxx

Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db@xxxxxxxxxxxxxxxxxxxxxxxxx
Reported-by: Lee Jones <lee.jones@xxxxxxxxxx>
Signed-off-by: Theodore Ts'o <tytso@xxxxxxx>
---
  fs/ext4/inode.c | 27 ++++++++++++++++++++++++++-
  1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 01c9e4f743ba..008fe8750109 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1993,6 +1993,15 @@ static int ext4_writepage(struct page *page,
  	else
  		len = PAGE_SIZE;
+ /* Should never happen but for bugs in other kernel subsystems */
+	if (!page_has_buffers(page)) {
+		ext4_warning_inode(inode,
+		   "page %lu does not have buffers attached", page->index);
+		ClearPageDirty(page);
+		unlock_page(page);
+		return 0;
+	}
+
  	page_bufs = page_buffers(page);
  	/*
  	 * We cannot do block allocation or other extent handling in this
@@ -2588,12 +2597,28 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
  			     (mpd->wbc->sync_mode == WB_SYNC_NONE)) ||
  			    unlikely(page->mapping != mapping)) {
  				unlock_page(page);
-				continue;
+				goto out;
  			}
wait_on_page_writeback(page);
  			BUG_ON(PageWriteback(page));
+ /*
+			 * Should never happen but for buggy code in
+			 * other subsystems that call
+			 * set_page_dirty() without properly warning
+			 * the file system first.  See [1] for more
+			 * information.
+			 *
+			 * [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@xxxxxxxxxxxxxx
+			 */
+			if (!page_has_buffers(page)) {
+				ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", page->index);
+				ClearPageDirty(page);
+				unlock_page(page);
+				continue;
+			}
+
  			if (mpd->map.m_len == 0)
  				mpd->first_page = page->index;
  			mpd->next_page = page->index + 1;





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux