On 21 November 2014 20:14, Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
> Hi Peter,
>
> On Wed, Oct 29, 2014 at 05:56:59PM +0000, Peter Maydell wrote:
>> On 29 October 2014 17:46, Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
>> > After some chat during the KVMForum I've been already thinking it
>> > could be beneficial for some usage to give userland the information
>> > about the fault being read or write
>>
>> ...I wonder if that would let us replace the current nasty
>> mess we use in linux-user to detect read vs write faults
>> (which uses a bunch of architecture-specific hacks including
>> in some cases "look at the insn that triggered this SEGV and
>> decode it to see if it was a load or a store"; see the
>> various cpu_signal_handler() implementations in user-exec.c).
>
> There's currently no plan to deliver to userland read access
> notifications of a present page, simply because the task of the
> userfaultfd is to handle the page fault in userland, but if the page
> is mapped and readable it won't fault in the first place :). I just
> mean it's not like gdb read watch.

If it's mapped and readable-but-not-writable then it should still
fault on write accesses, though? These are cases we currently get
SEGV for, anyway.

> Even if the region would be set to PROT_NONE it would still SEGV
> without triggering an userfault (after all pte_present would still
> true because the page is still mapped despite not being readable, so
> in any case it wouldn't be considered a not-present page fault).

Ah, I guess we have a terminology difference. I was considering
"page fault" to mean (roughly) "anything that causes the CPU to
take an exception on an attempted load/store" and expected that
userfaultfd would notify userspace of any of those. (Well, not
alignment faults, maybe, but I'm definitely surprised that
access permission issues don't get reported the same way as
page-completely-missing issues.
In other words I was expecting that this was "everything previously
reported via SIGSEGV or SIGBUS now comes via userfaultfd".)

> Temporarily removing/moving the page with remap_anon_pages shall be
> much better than using PROT_NONE for this (or alternative syscall name
> to differentiate it further from remap_file_pages, or equivalent
> userfaultfd command if we decide to hide the pte/pmd mangling as
> userfaultfd commands instead of adding new standalone syscalls).

We don't use PROT_NONE for the linux-user situation, we just
use mprotect() to remove the PAGE_WRITE permission so it's
still readable.

I suspect actually linux-user would be better off implementing
something like "if this is a page which we've mapped read-only
because we translated code out of it, then go ahead and remap
it r/w and throw away the translation and retry the access,
otherwise report SEGV to the guest", because taking SEGVs
shouldn't be a fast path in the guest binary. That would let
us work without architecture-specific junk and without
requiring new kernel features either. So you can ignore this
whole tangent thread :-)

thanks
-- PMM
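
For illustration only, the "remap it r/w, throw away the translation
and retry" idea for linux-user could be sketched roughly as below.
This is not QEMU code: it assumes Linux, tracks a single region, and
every name in it (protect_translated_page, invalidate_translation,
run_demo and so on) is hypothetical. The point it tries to show is
that the handler never has to decode the faulting instruction: the
page stays readable, so any SEGV inside the tracked range must have
been a write.

```c
#define _DEFAULT_SOURCE            /* MAP_ANONYMOUS, sigaction on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping: the one region we write-protected ourselves. */
static volatile uintptr_t tracked_start;
static volatile size_t tracked_len;
static volatile sig_atomic_t invalidations;

/* Hypothetical stand-in for "throw away the translation". */
static void invalidate_translation(void)
{
    invalidations++;
}

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    (void)ctx;

    if (tracked_len != 0 && addr >= tracked_start &&
        addr - tracked_start < tracked_len) {
        /* Our own read-only page: the access must have been a write. */
        invalidate_translation();
        /* mprotect() is a plain syscall; assumed safe to call here on Linux. */
        if (mprotect((void *)tracked_start, tracked_len,
                     PROT_READ | PROT_WRITE) == 0) {
            tracked_len = 0;
            return;             /* returning retries the faulting store */
        }
    }
    /* Not ours: restore the default action so the retried access is fatal.
     * A real emulator would report SEGV to the guest here instead. */
    signal(sig, SIG_DFL);
}

int install_write_fault_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;   /* needed to receive si_addr */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Record the region and drop write permission, keeping it readable. */
int protect_translated_page(void *page, size_t len)
{
    tracked_start = (uintptr_t)page;
    tracked_len = len;
    return mprotect(page, len, PROT_READ);
}

/* Small self-check: write to a protected page and return what was stored. */
int run_demo(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int before = invalidations;
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    volatile char *page = mem;
    int value;

    assert(mem != MAP_FAILED);
    assert(install_write_fault_handler() == 0);
    assert(protect_translated_page(mem, len) == 0);

    page[0] = 42;               /* faults once, handler remaps, store retried */
    value = page[0];
    assert(invalidations == before + 1);

    munmap(mem, len);
    return value;
}
```

Calling run_demo() takes exactly one SEGV on the store and then
completes it after the handler has restored write permission. A real
implementation would need a lookup over many pages and proper guest
signal delivery, both of which are left out here.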