On Thu, Jun 24, 2021 at 12:15:46PM +0100, Matthew Wilcox wrote:
> On Thu, Jun 24, 2021 at 08:04:07AM +0100, Christoph Hellwig wrote:
> > On Thu, Jun 24, 2021 at 04:24:46AM +0100, Matthew Wilcox wrote:
> > > On Thu, Jun 24, 2021 at 11:10:41AM +0800, Chen Huang wrote:
> > > > In userspace, I perform the following operations:
> > > >
> > > > fd = open("/tmp/test", O_RDWR | O_SYNC);
> > > > access_address = (char *)mmap(NULL, uio_size, PROT_READ, MAP_SHARED, uio_fd, 0);
> > > > ret = write(fd, access_address + 2, sizeof(long));
> > >
> > > ... you know that accessing this at unaligned offsets isn't going to
> > > work.  It's completely meaningless.  Why are you trying to do it?
> >
> > We still should not cause an infinite loop in kernel space due to a
> > userspace programmer error.
>
> They're running as root and they've mapped some device memory.  We can't
> save them from themself.  Imagine if they'd done this to the NVMe BAR.

Ignoring the MMIO case for now, I can trigger the same infinite loop
with MTE (the arm64 Memory Tagging Extension), something like:

	char *a;

	a = mmap(0, page_sz, PROT_READ | PROT_WRITE | PROT_MTE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	/* tag 0 is the default, set tag 1 for the next 16 bytes */
	set_tag((unsigned long)(a + 16) | (1UL << 56));

	/* uaccess to a[16] expected to fail */
	bytes = write(fd, a + 14, 8);
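(For reference, set_tag() above is not a libc function; a minimal sketch
along the lines of the example in
Documentation/arm64/memory-tagging-extension.rst, together with the
prctl() that opts the task in to MTE with synchronous tag check faults
before the mmap() above:)

	#include <sys/prctl.h>
	#include <linux/prctl.h>

	/*
	 * STG stores the logical tag from bits 59:56 of the pointer as
	 * the allocation tag of the 16-byte granule it points at.
	 */
	#define set_tag(tagged_addr) do {				\
		asm volatile("stg %0, [%0]"				\
			     : : "r" (tagged_addr) : "memory");		\
	} while (0)

	/* enable tagged addressing and synchronous MTE tag check faults */
	prctl(PR_SET_TAGGED_ADDR_CTRL,
	      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0);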
The iov_iter_fault_in_readable() check succeeds since a[14] has tag 0.
However, copy_from_user() then attempts an unaligned 8-byte load which
faults because of the mismatched tag at a[16], so no bytes are copied
and the loop in generic_perform_write() continues indefinitely.

copy_from_user() is not required to copy as much as it can before
faulting, so I don't think the 1-byte read per page done by
iov_iter_fault_in_readable() is sufficient to guarantee progress unless
copy_from_user() also manages to read at least one byte.

We could change raw_copy_from_user() to fall back to a 1-byte read when
a wider access faults, or fix this corner case in the generic code.  A
quick hack, re-attempting the copy with a single byte:

------------------8<-------------------------
diff --git a/mm/filemap.c b/mm/filemap.c
index 66f7e9fdfbc4..67059071460c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3686,8 +3686,18 @@ ssize_t generic_perform_write(struct file *file,
 			 * because not all segments in the iov can be copied at
 			 * once without a pagefault.
 			 */
-			bytes = min_t(unsigned long, PAGE_SIZE - offset,
-						iov_iter_single_seg_count(i));
+			unsigned long single_seg_bytes =
+				min_t(unsigned long, PAGE_SIZE - offset,
+				      iov_iter_single_seg_count(i));
+
+			/*
+			 * Check for intra-page faults (arm64 MTE, SPARC ADI)
+			 * and fall back to a single byte.
+			 */
+			if (bytes > single_seg_bytes)
+				bytes = single_seg_bytes;
+			else
+				bytes = 1;
 			goto again;
 		}
 		pos += copied;
------------------8<-------------------------

Or a slightly different hack which tries to detect whether the first
segment crosses a page boundary:

------------------8<-------------------------
diff --git a/mm/filemap.c b/mm/filemap.c
index 66f7e9fdfbc4..7d1c03f5f559 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3678,16 +3678,24 @@ ssize_t generic_perform_write(struct file *file,
 
 		iov_iter_advance(i, copied);
 		if (unlikely(copied == 0)) {
+			struct iovec v = iov_iter_iovec(i);
+
 			/*
 			 * If we were unable to copy any data at all, we must
-			 * fall back to a single segment length write.
+			 * fall back to a single segment length write or a
+			 * single byte write (for intra-page faults - arm64
+			 * MTE or SPARC ADI).
 			 *
 			 * If we didn't fallback here, we could livelock
-			 * because not all segments in the iov can be copied at
-			 * once without a pagefault.
+			 * because not all segments in the iov or data within
+			 * a segment can be copied at once without a fault.
 			 */
-			bytes = min_t(unsigned long, PAGE_SIZE - offset,
-						iov_iter_single_seg_count(i));
+			if (((unsigned long)v.iov_base & PAGE_MASK) ==
+			    ((unsigned long)(v.iov_base + bytes) & PAGE_MASK))
+				bytes = 1;
+			else
+				bytes = min_t(unsigned long, PAGE_SIZE - offset,
+					      iov_iter_single_seg_count(i));
 			goto again;
 		}
 		pos += copied;
------------------8<-------------------------

--
Catalin