On Fri, 2019-10-04 at 07:56 -0700, Andy Lutomirski wrote:
> On Thu, Oct 3, 2019 at 2:38 PM Rick Edgecombe
> <rick.p.edgecombe@xxxxxxxxx> wrote:
> >
> > This patchset enables the ability for KVM guests to create execute-only (XO)
> > memory by utilizing EPT based XO permissions. XO memory is currently
> > supported
> > on Intel hardware natively for CPU's with PKU, but this enables it on older
> > platforms, and can support XO for kernel memory as well.
>
> The patchset seems to sometimes call this feature "XO" and sometimes
> call it "NR". To me, XO implies no-read and no-write, whereas NR
> implies just no-read. Can you please clarify *exactly* what the new
> bit does and be consistent?
>
> I suggest that you make it NR, which allows for PROT_EXEC and
> PROT_EXEC|PROT_WRITE and plain PROT_WRITE. WX is of dubious value,
> but I can imagine plain W being genuinely useful for logging and for
> JITs that could maintain a W and a separate X mapping of some code.
> In other words, with an NR bit, all 8 logical access modes are
> possible. Also, keeping the paging bits more orthogonal seems nice --
> we already have a bit that controls write access.

Sorry, yes, the behavior of this bit needs to be documented a lot better. I
will definitely do this for the next version.

To clarify, since the EPT permissions in the XO/NR range are executable, and
not readable or writable, the new bit really means XO, but only when NX is 0,
since the guest page tables are being checked as well. When NR=1, W=1, and
NX=0, the memory is still XO.

NR was picked over XO for the reasons you give. The idea is that, for KVM XO,
NR plus writable can be defined as an invalid combination, just as writable
but not readable is defined as invalid for the EPT.

I *think* whenever NX=1, NR=1 it should be similar to not present, in that it
can't be used for anything or have its translation cached.
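
To make the combinations above concrete, here is a minimal sketch of the
effective access they imply. It is an illustration only, not code from the
patchset: the struct, the ACC_* flags, and effective_access() are hypothetical
names, and it assumes a present page with every other check (U/S, PKU, and so
on) left out.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of these come from the patchset. */
struct guest_pte_bits {
	bool nr;	/* proposed no-read bit */
	bool w;		/* guest page table write bit */
	bool nx;	/* guest page table no-execute bit */
};

enum {
	ACC_R = 1 << 0,
	ACC_W = 1 << 1,
	ACC_X = 1 << 2,
};

/*
 * Effective access for a present page, following the behavior described
 * in this mail. All other permission checks are ignored (assumption).
 */
static unsigned int effective_access(struct guest_pte_bits b)
{
	unsigned int acc;

	if (b.nr) {
		/*
		 * The EPT grants execute only in the XO/NR range, so W has
		 * no effect. The guest NX bit is still checked and can take
		 * execute away, leaving no access at all.
		 */
		return b.nx ? 0 : ACC_X;
	}

	/* NR=0: ordinary guest page table semantics. */
	acc = ACC_R;
	if (b.w)
		acc |= ACC_W;
	if (!b.nx)
		acc |= ACC_X;
	return acc;
}
```

With this model, NR=1, W=1, NX=0 yields ACC_X only, and NR=1, NX=1 yields 0,
which matches the "similar to not present" case.
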
I am not 100% sure on the cached part and was thinking of just making the
"spec" say that the translation caching behavior is undefined. I can look
into this if anyone thinks we need to know. In the current patchset it
shouldn't be possible to create this combination.

Since write-only memory isn't supported in EPT, we can't do the same trick to
create a new HW permission. But I guess if we emulate it, we could make the
new bit mean just NR, and support write-only by allowing emulation when KVM
gets a write EPT violation to NR memory. It might still be useful for the JIT
case you mentioned, or a shared memory mailbox.

On the other hand, userspace might be surprised to find that memory runs at
different speeds depending on the permission. I also wonder if any userspace
apps are asking for just PROT_WRITE and expecting readable memory.

Thanks,

Rick