On Wed, May 10, 2023 at 2:50 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Tue, May 09, 2023 at 01:52:05PM -0700, Anish Moorthy wrote:
> > On Sun, May 7, 2023 at 6:23 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> What I wanted to do is to understand whether there's still chance to
> provide a generic solution. I don't know why you have had a bunch of pmu
> stack showing in the graph, perhaps you forgot to disable some of the perf
> events when doing the test? Let me know if you figure out why it happened
> like that (so far I didn't see), but I feel guilty to keep overloading you
> with such questions.
>
> The major problem I had with this series is it's definitely not a clean
> approach. Say, even if you'll all rely on userapp you'll still need to
> rely on userfaultfd for kernel traps on corner cases or it just won't work.
> IIUC that's also the concern from Nadav.

This is a long thread, so apologies if the following has already been
discussed.

Would per-tid userfaultfd support be a generic solution? i.e. allow
userspace to create a userfaultfd that is tied to a specific task. Any
userfaults encountered by that task use that fd, rather than the
process-wide fd. I'm making the assumption here that each of these fds
would have independent signaling mechanisms/queues, and so this would
solve the scaling problem.

A VMM could use this to create 1 userfaultfd per vCPU and 1 thread per
vCPU for handling userfault requests. This seems like it'd have roughly
the same scalability characteristics as the KVM -EFAULT approach.
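
To make the idea concrete, here is a rough sketch of what the per-vCPU
usage could look like. Nothing like this exists today: the UFFD_TASK_BOUND
flag below is made up purely for illustration, and the struct/function
names are mine. Everything else is the existing userfaultfd ioctl ABI. The
assumption is that each vCPU thread creates its own (task-bound) fd, so
faults it takes land only on that fd's queue, and a dedicated handler
thread drains it:

/* Hypothetical sketch: one task-bound userfaultfd plus one handler thread
 * per vCPU. UFFD_TASK_BOUND is an invented flag -- it does not exist in
 * any kernel -- the rest is the current userfaultfd ABI. */
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#define UFFD_TASK_BOUND 0x80000000   /* hypothetical: bind fd to calling task */

struct vcpu_uffd {
    int fd;
    void *guest_mem;                 /* base of the registered region */
    size_t mem_size;
    void *src_page;                  /* page-aligned buffer used as copy source */
    size_t page_size;
};

/* Called on the vCPU thread itself, so the fd is (hypothetically) bound to
 * that task and only receives faults the vCPU takes. */
static int vcpu_uffd_init(struct vcpu_uffd *vu)
{
    struct uffdio_api api = { .api = UFFD_API };
    struct uffdio_register reg = {
        .range = { .start = (unsigned long)vu->guest_mem, .len = vu->mem_size },
        .mode = UFFDIO_REGISTER_MODE_MISSING,
    };

    vu->fd = syscall(__NR_userfaultfd, O_CLOEXEC | UFFD_TASK_BOUND);
    if (vu->fd < 0)
        return -1;
    if (ioctl(vu->fd, UFFDIO_API, &api) || ioctl(vu->fd, UFFDIO_REGISTER, &reg))
        return -1;
    return 0;
}

/* Body of the per-vCPU handler thread: drains only this vCPU's faults. */
static void *vcpu_uffd_handler(void *arg)
{
    struct vcpu_uffd *vu = arg;
    struct pollfd pfd = { .fd = vu->fd, .events = POLLIN };
    struct uffd_msg msg;

    while (poll(&pfd, 1, -1) > 0) {
        if (read(vu->fd, &msg, sizeof(msg)) != sizeof(msg))
            continue;
        if (msg.event != UFFD_EVENT_PAGEFAULT)
            continue;
        /* Demand-fetch of the page would go here; for the sketch, just
         * resolve the fault by copying a prepared source page in. */
        struct uffdio_copy copy = {
            .dst = msg.arg.pagefault.address & ~((__u64)vu->page_size - 1),
            .src = (unsigned long)vu->src_page,
            .len = vu->page_size,
        };
        ioctl(vu->fd, UFFDIO_COPY, &copy);
    }
    return NULL;
}

The VMM would spawn one handler thread per vCPU over its own struct
vcpu_uffd, so each handler only ever sees faults taken by its paired vCPU
and there's no process-wide wait queue for all vCPUs to contend on, which
is the scaling property I'm after.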