Re: LTP regressions due to 6dc296e7df4c ("mm: make sure all file VMAs have ->vm_ops set")

On 09/15, Kirill A. Shutemov wrote:
>
> On Tue, Sep 15, 2015 at 02:12:01PM +0200, Oleg Nesterov wrote:
> > On 09/14, Kirill A. Shutemov wrote:
> > >
> > > On Mon, Sep 14, 2015 at 07:05:47PM +0200, Oleg Nesterov wrote:
> > > > On 09/14, Kirill A. Shutemov wrote:
> > > > >
> > > > > Fix is below. I don't really like it, but I cannot find any better
> > > > > solution.
> > > >
> > > > Me too...
> > > >
> > > > But this change "documents" the nasty special "vm_file && !vm_ops" case, and
> > > > I am not sure how we can remove it later...
> > > >
> > > > So perhaps we should change vma_is_anonymous() back to check ->fault too,
> > > >
> > > > 	 static inline bool vma_is_anonymous(struct vm_area_struct *vma)
> > > > 	 {
> > > > 	-	return !vma->vm_ops;
> > > > 	+	return !vma->vm_ops || !vma->vm_ops->fault;
> > >
> > > No. This would give a lot of false positives from drivers which set up
> > > page tables upfront and don't use ->fault at all.
> >
> > And? I mean, I am not sure I understand what exactly do you dislike.
> >
> > Firstly, I still think that (in the long term) we should change them
> > to use .fault = no_fault() which just returns VM_FAULT_SIGBUS.
>
> I would rather like to see consolidated fault path between file and anon
> with ->vm_ops set for both. So vma_is_anonymous() will be trivial
> vma->vm_ops == anon_vm_ops.

I thought about this too. Perhaps, but I guess this needs another
discussion.

In particular I am not sure we should just rely on vm_ops == anon_vm_ops.
Again, it is not that I think that the VM_MPX check in arch_vma_name() is
that bad. Still I think it would be better if mpx_mmap() could install
vma->vm_ops = mpx_vm_ops with ->name(). So perhaps ->anon_fault() makes
more sense. But let's not discuss this right now.
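
A minimal userspace sketch of the concern above (not kernel code). The
structs are simplified stand-ins, and anon_vm_ops, mpx_vm_ops and the
->anon_fault hook are hypothetical names taken from the ideas in this
thread; none of them exist in the tree under discussion. It only shows that
a plain pointer comparison stops treating a vma as anonymous once the vma
installs its own ops for ->name(), while a per-ops hook would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins; not the kernel definitions. */
struct vm_area_struct;

struct vm_operations_struct {
	int (*anon_fault)(struct vm_area_struct *vma);	/* hypothetical hook */
	const char *(*name)(struct vm_area_struct *vma);
};

struct vm_area_struct {
	const struct vm_operations_struct *vm_ops;
};

static int generic_anon_fault(struct vm_area_struct *vma)
{
	(void)vma;
	return 0;
}

static const char *mpx_name(struct vm_area_struct *vma)
{
	(void)vma;
	return "[mpx]";
}

/* Hypothetical shared ops table for ordinary anonymous memory. */
static const struct vm_operations_struct anon_vm_ops = {
	.anon_fault = generic_anon_fault,
};

/* Hypothetical ops table that only adds ->name() on top of that. */
static const struct vm_operations_struct mpx_vm_ops = {
	.anon_fault = generic_anon_fault,
	.name = mpx_name,
};

/* Variant 1: anonymous means "uses the one shared ops table". */
static bool vma_is_anonymous_by_ops(const struct vm_area_struct *vma)
{
	assert(vma != NULL);
	return vma->vm_ops == &anon_vm_ops;
}

/* Variant 2: anonymous means "faults go through ->anon_fault". */
static bool vma_is_anonymous_by_hook(const struct vm_area_struct *vma)
{
	assert(vma != NULL);
	return vma->vm_ops != NULL && vma->vm_ops->anon_fault != NULL;
}
```

With these definitions, a vma using mpx_vm_ops is rejected by variant 1 and
accepted by variant 2, which is the reason for the doubt about relying on
vm_ops == anon_vm_ops alone.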

>
> > Until then I do not see why the change above can be really bad. The
> > VM_SHARED case is fine, do_anonymous_page() will return VM_FAULT_SIGBUS.
> >
> > So afaics the only problem is that after the change above the private
> > mapping can silently get an anonymous page after (say) MADV_DONTNEED
> > instead of the nice SIGBUS from do_fault(). I agree, this is not good,
> > but see above.
>
> So, what is the point of introducing vma_is_anonymous() if it often
> produces a false result? vma_is_anonymous_or_maybe_not()?

Heh.

Then what is the point of demanding that "All file mappings must have
->vm_ops set" if mmap(MAP_PRIVATE, "/dev/zero") has ->vm_ops == NULL?
Because this is not actually a file mapping, yes. And this is why we want
vma_is_anonymous() to return true in this case.

vma_is_anonymous() just says that a page fault will use do_anonymous_page().
I agree, it would be nice to ensure vma_is_anonymous() can only be true
if this vma can only have the anon pages. Let me repeat that I suggested
this change as a short-term fix (at least without other changes like we
discuss above). Because the mmap_zero() hack looks worse to me. Damn, even
the ugly hack below looks better to me.
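
A minimal userspace sketch of the two checks being compared (not kernel
code). The structs are simplified stand-ins and the names is_anon_strict,
is_anon_relaxed, file_ops and prefilled_ops are made up for illustration.
It shows where the two predicates disagree: a vma whose ops table exists
but has no ->fault, which is the "driver sets up page tables upfront" case
from the objection quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins; not the kernel definitions. */
struct vm_area_struct;

struct vm_operations_struct {
	int (*fault)(struct vm_area_struct *vma);
};

struct vm_area_struct {
	const struct vm_operations_struct *vm_ops;
};

static int file_fault(struct vm_area_struct *vma)
{
	(void)vma;
	return 0;
}

/* A regular file mapping: has vm_ops and a ->fault handler. */
static const struct vm_operations_struct file_ops = { .fault = file_fault };

/* A driver that fills page tables up front: vm_ops, but no ->fault. */
static const struct vm_operations_struct prefilled_ops = { .fault = NULL };

/* Current check: anonymous only if there are no vm_ops at all. */
static bool is_anon_strict(const struct vm_area_struct *vma)
{
	assert(vma != NULL);
	return !vma->vm_ops;
}

/* Suggested short-term check: a missing ->fault also counts. */
static bool is_anon_relaxed(const struct vm_area_struct *vma)
{
	assert(vma != NULL);
	return !vma->vm_ops || !vma->vm_ops->fault;
}
```

Both checks agree on a vma without vm_ops and on one using file_ops. They
differ only for prefilled_ops, where the relaxed check reports "anonymous".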

> > Whether we need to keep the vm_ops/fault check in __vma_link_rb() and
> > mmap_region() is another issue. But if we keep them, then I think we
> > should at least turn the !vma->vm_ops check in mmap_region into
> > WARN_ON() as well.
>
> It would require first fixing all known cases where ->f_op->mmap() returns
> vma->vm_ops == NULL. Not a subject for 4.3, I think.

Kirill, I even sent you a private email to clarify that - of course! -
I only meant "in the longer term" ;)

Oleg.

--- x/include/linux/mm.h
+++ x/include/linux/mm.h
@@ -1289,9 +1289,11 @@ static inline int vma_growsdown(struct v
 	return vma && (vma->vm_end == addr) && (vma->vm_flags & VM_GROWSDOWN);
 }
 
+#define xxx_fault	((void*)1)
+
 static inline bool vma_is_anonymous(struct vm_area_struct *vma)
 {
-	return !vma->vm_ops;
+	return !vma->vm_ops || vma->vm_ops->fault == xxx_fault;
 }
 
 static inline int stack_guard_page_start(struct vm_area_struct *vma,
--- x/drivers/char/mem.c
+++ x/drivers/char/mem.c
@@ -653,11 +653,17 @@ static ssize_t read_iter_zero(struct kio
 
 static int mmap_zero(struct file *file, struct vm_area_struct *vma)
 {
+	static const struct vm_operations_struct xxx_ops = {
+		.fault = xxx_fault,
+	};
+		}
 #ifndef CONFIG_MMU
 	return -ENOSYS;
 #endif
 	if (vma->vm_flags & VM_SHARED)
 		return shmem_zero_setup(vma);
+
+	vma->vm_ops = &xxx_ops;
 	return 0;
 }
 
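
A minimal userspace sketch of the marker idea in the patch above (not kernel
code). The structs are simplified stand-ins and marker_fault, marker_ops,
file_ops and prefilled_ops are illustrative names. One deliberate
difference: the patch uses the constant (void *)1 as the marker, while this
sketch compares against the address of a real function so that it stays
within portable C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins; not the kernel definitions. */
struct vm_area_struct;
typedef int (*fault_fn)(struct vm_area_struct *vma);

struct vm_operations_struct {
	fault_fn fault;
};

struct vm_area_struct {
	const struct vm_operations_struct *vm_ops;
};

/* Marker handler: never meant to be called, only compared by address. */
static int marker_fault(struct vm_area_struct *vma)
{
	(void)vma;
	return -1;
}

static int file_fault(struct vm_area_struct *vma)
{
	(void)vma;
	return 0;
}

/* Ops installed by a mapping that wants to be treated as anonymous. */
static const struct vm_operations_struct marker_ops = { .fault = marker_fault };

/* A regular file mapping with a real handler. */
static const struct vm_operations_struct file_ops = { .fault = file_fault };

/* A driver that fills page tables up front and has no ->fault. */
static const struct vm_operations_struct prefilled_ops = { .fault = NULL };

/* Anonymous if there are no vm_ops, or ->fault is the marker. */
static bool vma_is_anonymous(const struct vm_area_struct *vma)
{
	assert(vma != NULL);
	return !vma->vm_ops || vma->vm_ops->fault == marker_fault;
}
```

Unlike a plain "no ->fault" test, the marker comparison does not classify
the prefilled_ops case as anonymous; only a mapping that explicitly opts in
through marker_ops is treated that way.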
