Re: [PATCH] hugetlb: unshare some PMDs when splitting VMAs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> > I'll see if I can confirm that this is indeed possible and send a
> > repro if it is.
>
> I think your analysis above is correct.  The key being the failure to unshare
> in the non-PUD_SIZE vma after the split.

I do indeed hit the WARN_ON_ONCE (repro attached), and the MADV wasn't
even needed (the UFFDIO_REGISTER does the VMA split before "unsharing
all PMDs"). With the fix, we avoid the WARN_ON_ONCE, but the behavior
is still incorrect: I expect the address range to be write-protected,
but it isn't.

The reason why is that hugetlb_change_protection uses huge_pte_offset,
even if it's being called for a UFFDIO_WRITEPROTECT with
UFFDIO_WRITEPROTECT_MODE_WP. In that particular case, I'm pretty sure
we should be using huge_pte_alloc, but even so, it's not trivial to
get an allocation failure back up to userspace. The non-hugetlb
implementation of UFFDIO_WRITEPROTECT seems to also have this problem.

Peter, what do you think?

>
> To me, the fact it was somewhat difficult to come up with this scenario is an
> argument what we should just unshare at split time as you propose.  Who
> knows what other issues may exist.
>
> > 60dfaad65a ("mm/hugetlb: allow uffd wr-protect none ptes") is the
> > commit that introduced the WARN_ON_ONCE; perhaps it's a good choice
> > for a Fixes: tag (if above is indeed true).
>
> If the key issue in your above scenario is indeed the failure of
> hugetlb_unshare_all_pmds in the non-PUD_SIZE vma, then perhaps we tag?
>
> 6dfeaff93be1 ("hugetlb/userfaultfd: unshare all pmds for hugetlbfs when
> register wp")

SGTM. Thanks Mike.
#define _GNU_SOURCE

#include <unistd.h>
#include <fcntl.h>
#include <linux/memfd.h>
#include <linux/mman.h>
#include <sys/mman.h>
#include <linux/userfaultfd.h>
#include <linux/errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>

#define PAGES_PER_GIG 512
#define HUGE_PAGE_SIZE (2UL << 20)
#define GIG_MASK ~(PAGES_PER_GIG * HUGE_PAGE_SIZE - 1)

void fault_in_write(char *mapping, size_t len)
{
	volatile char *mapping_v = mapping;
	for (size_t i = 0; i < len; i += 4096)
		mapping_v[i] = 1;
}

int main()
{
	int fd = memfd_create("test", MFD_HUGETLB);
	size_t len = 2 * PAGES_PER_GIG * HUGE_PAGE_SIZE;
	char *mapping;
	char *mapping_to_use;
	int uffd;
	int pid;
	struct uffdio_api uffdio_api;
	struct uffdio_register uffdio_register;
	struct uffdio_writeprotect uffdio_writeprotect;

	if (ftruncate(fd, len)) {
		perror("ftruncate failed");
		return -1;
	}
	mapping = mmap(NULL, len, PROT_WRITE | PROT_READ, MAP_SHARED, fd, 0);
	if (mapping == MAP_FAILED) {
		perror("mmap failed");
		return -1;
	}

	mapping_to_use = mapping;
	if ((unsigned long)mapping & ~GIG_MASK) {
		mapping_to_use = (char *)((unsigned long)mapping & GIG_MASK) + (len/2);
	}

	pid = fork();
	if (pid < 0) {
		perror("fork failed");
		return -1;
	}

	fault_in_write(mapping, len);

	if (pid > 0) {
		int status;
		waitpid(pid, &status, 0);
		return WEXITSTATUS(status);
	}

	uffd = syscall(SYS_userfaultfd, O_CLOEXEC);
	uffdio_api.api = UFFD_API;
	uffdio_api.features = UFFD_FEATURE_SIGBUS;
	uffdio_api.ioctls = 0;
	if (ioctl(uffd, UFFDIO_API, &uffdio_api)) {
		perror("UFFDIO_API failed");
		return -1;
	}

	uffdio_register.range.start = (unsigned long)mapping_to_use;
	uffdio_register.range.len = len/4;
	uffdio_register.mode = UFFDIO_REGISTER_MODE_WP;

	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) {
		perror("UFFDIO_REGISTER failed");
		return -1;
	}

	uffdio_writeprotect.range.start = (unsigned long)mapping_to_use;
	uffdio_writeprotect.range.len = len/4;
	uffdio_writeprotect.mode = UFFDIO_WRITEPROTECT_MODE_WP;
	if (ioctl(uffd, UFFDIO_WRITEPROTECT, &uffdio_writeprotect)) {
		perror("UFFDIO_WRITEPROTECT failed");
		return -1;
	}

	/* If correct: should SIGBUS */
	fault_in_write(mapping_to_use, 1);

	printf("BUG: no sigbus\n");
	return 0;
}

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux