Re: Buffered I/O broken on s390x with page faults disabled (gfs2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 8, 2022 at 9:40 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hmm. The futex code actually uses "fixup_user_fault()" for this case.
> Maybe fault_in_safe_writeable() should do that?

.. paging all the bits back in, I'm reminded that one of the worries
was "what about large areas".

But I really think that the solution should be that we limit the size
of fault_in_safe_writeable() to just a couple of pages.

Even if you were to fault in gigabytes, page-out can undo it anyway,
so there is no semantic reason why that function should ever do more
than a few pages to make sure. There's already even a comment about
how there's no guarantee that the pages will stay.

Side note: the current GUP-based fault_in_safe_writeable() is buggy in
another way anyway: it doesn't work right for stack extending
accesses.

So I think the fix for this all might be something like the attached
(TOTALLY UNTESTED)!

Comments? Andreas, mind (carefully - maybe it is totally broken and
does unspeakable acts to your pets) testing this?

                         Linus
 mm/gup.c | 40 ++++++++++++----------------------------
 1 file changed, 12 insertions(+), 28 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index a9d4d724aef7..9e085e7b9c28 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1745,44 +1745,28 @@ EXPORT_SYMBOL(fault_in_writeable);
 size_t fault_in_safe_writeable(const char __user *uaddr, size_t size)
 {
 	unsigned long start = (unsigned long)untagged_addr(uaddr);
-	unsigned long end, nstart, nend;
+	unsigned long end, nstart;
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma = NULL;
-	int locked = 0;
+	const unsigned int fault_flags = FAULT_FLAG_WRITE | FAULT_FLAG_KILLABLE;
+	const size_t max_size = 4 * PAGE_SIZE;
 
 	nstart = start & PAGE_MASK;
-	end = PAGE_ALIGN(start + size);
+	end = PAGE_ALIGN(start + min(size, max_size));
 	if (end < nstart)
 		end = 0;
-	for (; nstart != end; nstart = nend) {
-		unsigned long nr_pages;
-		long ret;
 
-		if (!locked) {
-			locked = 1;
-			mmap_read_lock(mm);
-			vma = find_vma(mm, nstart);
-		} else if (nstart >= vma->vm_end)
-			vma = vma->vm_next;
-		if (!vma || vma->vm_start >= end)
-			break;
-		nend = end ? min(end, vma->vm_end) : vma->vm_end;
-		if (vma->vm_flags & (VM_IO | VM_PFNMAP))
-			continue;
-		if (nstart < vma->vm_start)
-			nstart = vma->vm_start;
-		nr_pages = (nend - nstart) / PAGE_SIZE;
-		ret = __get_user_pages_locked(mm, nstart, nr_pages,
-					      NULL, NULL, &locked,
-					      FOLL_TOUCH | FOLL_WRITE);
-		if (ret <= 0)
+	mmap_read_lock(mm);
+	for (; nstart != end; nstart += PAGE_SIZE) {
+		if (fixup_user_fault(mm, nstart, fault_flags, NULL))
 			break;
-		nend = nstart + ret * PAGE_SIZE;
 	}
-	if (locked)
-		mmap_read_unlock(mm);
+	mmap_read_unlock(mm);
+
+	/* If we got all of our (truncated) fault-in, we claim we got it all */
 	if (nstart == end)
 		return 0;
+
+	/* .. otherwise we'll use the original untruncated size */
 	return size - min_t(size_t, nstart - start, size);
 }
 EXPORT_SYMBOL(fault_in_safe_writeable);

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux