On Wed, Apr 19, 2023 at 6:51 AM Hoo Robert <robert.hoo.linux@xxxxxxxxx> wrote: > > On 4/13/2023 5:34 AM, Anish Moorthy wrote: > > At the moment, demand_paging_test does not support profiling/testing > > multiple vCPU threads concurrently faulting on a single uffd because > > > > (a) "-u" (run test in userfaultfd mode) creates a uffd for each vCPU's > > region, so that each uffd services a single vCPU thread. > > (b) "-u -o" (userfaultfd mode + overlapped vCPU memory accesses) > > simply doesn't work: the test tries to register the same memory > > to multiple uffds, causing an error. > > > > Add support for many vcpus per uffd by > > (1) Keeping "-u" behavior unchanged. > > (2) Making "-u -a" create a single uffd for all of guest memory. > > (3) Making "-u -o" implicitly pass "-a", solving the problem in (b). > > In cases (2) and (3) all vCPU threads fault on a single uffd. > > > > With multiple potentially multiple vCPU per UFFD, it makes sense to > ^^^^^^^^ > redundant "multiple"? Thanks, fixed > > --- a/tools/testing/selftests/kvm/demand_paging_test.c > > +++ b/tools/testing/selftests/kvm/demand_paging_test.c > > @@ -77,9 +77,15 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, > > copy.mode = 0; > > > > r = ioctl(uffd, UFFDIO_COPY, ©); > > - if (r == -1) { > > - pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d with errno: %d\n", > > - addr, tid, errno); > > + /* > > + * With multiple vCPU threads fault on a single page and there are > > + * multiple readers for the UFFD, at least one of the UFFDIO_COPYs > > + * will fail with EEXIST: handle that case without signaling an > > + * error. > > + */ > > But this code path is also gone through in other cases, isn't it? In > those cases, is it still safe to ignore EEXIST? Good point: the answer is no, it's not always safe to ignore EEXISTs here. For instance the first UFFDIO_CONTINUE for a page shouldn't be allowed to EEXIST, and that's swept under the rug here. I've added the following to the comment + * Note that this does sweep under the rug any EEXISTs occurring + * from, e.g., the first UFFDIO_COPY/CONTINUEs on a page. A + * realistic VMM would maintain some other state to correctly + * surface EEXISTs to userspace or prevent duplicate + * COPY/CONTINUEs from happening in the first place. I could add that extra state to the self test (via for instance, an atomic bitmap that threads "or" into before issuing any COPY/CONTINUEs) but it's a bit of an extra complication without any real payoff. Let me know if you think the comment's inadequate though. > > + if (r == -1 && errno != EEXIST) { > > + pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d, errno = %d\n", > > + addr, tid, errno); > > unintended indent changes I think. > > > return r; > > } > > } else if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR) { > > @@ -89,9 +95,10 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, > > cont.range.len = demand_paging_size; > > > > r = ioctl(uffd, UFFDIO_CONTINUE, &cont); > > - if (r == -1) { > > - pr_info("Failed UFFDIO_CONTINUE in 0x%lx from thread %d with errno: %d\n", > > - addr, tid, errno); > > + /* See the note about EEXISTs in the UFFDIO_COPY branch. */ > > Personally I would suggest copy the comments here. what if some day above > code/comment was changed/deleted? You might be right: on the other hand, if the comment ever gets updated then it would have to be done in two places. Anyone to break the tie? :) > > @@ -148,18 +158,32 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, > > TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) == > > expected_ioctls, "missing userfaultfd ioctls"); > > > > - ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK); > > - TEST_ASSERT(!ret, "Failed to set up pipefd"); > > - > > uffd_desc->uffd_mode = uffd_mode; > > uffd_desc->uffd = uffd; > > uffd_desc->delay = delay; > > uffd_desc->handler = handler; > > Now that these info are encapsulated into reader args below, looks > unnecessary to have them in uffd_desc here. Good point. I've removed uffd_mode, delay, and handler from uffd_desc. I left the "uffd" field in because that's a shared resource, and close()ing it as "close(desc->uffd)" makes more sense than, say, "close(desc->reader_args[0].uffd)"