Re: [PATCH 15/33] userfaultfd: hugetlbfs: add __mcopy_atomic_hugetlb for huge page UFFDIO_COPY

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 03, 2016 at 10:33:09AM -0700, Mike Kravetz wrote:
> On 11/03/2016 03:15 AM, Hillf Danton wrote:
> >> +	if (zeropage)
> >> +		return -EINVAL;
> > 
> > Release mmap_sem before return?

This shows we need to extend the selftest to execute UFFDIO_ZEROPAGE
also on tmpfs and hugetlbfs two cases, and verify it returns -EINVAL.

> >> +
> >> +	src_addr = src_start;
> >> +	dst_addr = dst_start;
> >> +	copied = 0;
> >> +	page = NULL;
> >> +	vma_hpagesize = vma_kernel_pagesize(dst_vma);
> >> +
> >> +retry:
> >> +	/*
> >> +	 * On routine entry dst_vma is set.  If we had to drop mmap_sem and
> >> +	 * retry, dst_vma will be set to NULL and we must lookup again.
> >> +	 */
> >> +	err = -EINVAL;
> >> +	if (!dst_vma) {
> >> +		dst_vma = find_vma(dst_mm, dst_start);
> > 
> > In case of retry, s/dst_start/dst_addr/?
> > And check if we find a valid vma?

I don't think that's needed. Yes intuitively if a munmap zaps the
start of the vma during the copy we could continue, but userfaultfd
generally is as strict as it can get.

This is why UFFDIO_COPY is not doing like mremap, that just wipe
whatever existed in destination silently. UFFDIO_COPY returns -EEXIST
whenever something is already mapped there during a UFFDIO_COPY.

When it's userland managing the faults, being more strict I think it's
safer.

Running a copy concurrent with a munmap or any other vma mangling
leads to an undefined result. I think it's preferable to generate an
error to userland if it ever does an undefined operation considering
the risk if something goes wrong here while userland are managing the
faults. Furthermore this keeps the code simpler.

This is also why the revalidation code then does:

		if (dst_start < dst_vma->vm_start ||
		    dst_start + len > dst_vma->vm_end)
			goto out_unlock;

and it pretends the vma it is still there for the whole range being
copied.

So I tend to prefer the current version, letting it succeed silently
while correct and valid in theory, in practice sounds worse than the
current stricter behavior.

In any case if we change this for hugetlbfs, the non-hugetlbfs variant
should also be updated.

> >> @@ -182,6 +355,13 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
> >>  		goto out_unlock;
> >>
> >>  	/*
> >> +	 * If this is a HUGETLB vma, pass off to appropriate routine
> >> +	 */
> >> +	if (dst_vma->vm_flags & VM_HUGETLB)
> >> +		return  __mcopy_atomic_hugetlb(dst_mm, dst_vma, dst_start,
> >> +						src_start, len, false);
> > 
> > Use is_vm_hugetlb_page()? 
> > 
> > 
> 
> Thanks Hillf, all valid points.  I will create another version of
> this patch.

Nice cleanup yes.

Thanks!
Andrea

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]