Do we have test cases with guest CR0.WP=0 in the KVM test suite?

Thanks!
-Xin

> -----Original Message-----
> From: Avi Kivity [mailto:avi@xxxxxxxxxx]
> Sent: Monday, June 06, 2011 9:19 PM
> To: Marcelo Tosatti; kvm@xxxxxxxxxxxxxxx; Yang, Wei Y; Shan, Haitao; Li, Xin
> Subject: [PATCH] KVM: Adjust shadow paging to work when SMEP=1 and CR0.WP=0
>
> When CR0.WP=0, we sometimes map user pages as kernel pages (to allow
> the kernel to write to them).  Unfortunately this also allows the kernel
> to fetch from these pages, even if CR4.SMEP is set.
>
> Adjust for this by also setting NX on the spte in these circumstances.
>
> Signed-off-by: Avi Kivity <avi@xxxxxxxxxx>
> ---
>
> Turned out a little more complicated than I thought.
>
>  Documentation/virtual/kvm/mmu.txt |   18 ++++++++++++++++++
>  arch/x86/include/asm/kvm_host.h   |    1 +
>  arch/x86/kvm/mmu.c                |   14 +++++++++++++-
>  3 files changed, 32 insertions(+), 1 deletions(-)
>
> diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
> index f46aa58..5dc972c 100644
> --- a/Documentation/virtual/kvm/mmu.txt
> +++ b/Documentation/virtual/kvm/mmu.txt
> @@ -165,6 +165,10 @@ Shadow pages contain the following information:
>      Contains the value of efer.nxe for which the page is valid.
>    role.cr0_wp:
>      Contains the value of cr0.wp for which the page is valid.
> +  role.smep_andnot_wp:
> +    Contains the value of cr4.smep && !cr0.wp for which the page is valid
> +    (pages for which this is true are different from other pages; see the
> +    treatment of cr0.wp=0 below).
>    gfn:
>      Either the guest page table containing the translations shadowed by this
>      page, or the base page frame for linear translations.  See role.direct.
> @@ -317,6 +321,20 @@ on fault type:
>
>  (user write faults generate a #PF)
>
> +In the first case there is an additional complication if CR4.SMEP is
> +enabled: since we've turned the page into a kernel page, the kernel may now
> +execute it.  We handle this by also setting spte.nx.  If we get a user
> +fetch or read fault, we'll change spte.u=1 and spte.nx=gpte.nx back.
> +
> +To prevent an spte that was converted into a kernel page with cr0.wp=0
> +from being written by the kernel after cr0.wp has changed to 1, we make
> +the value of cr0.wp part of the page role.  This means that an spte created
> +with one value of cr0.wp cannot be used when cr0.wp has a different value -
> +it will simply be missed by the shadow page lookup code.  A similar issue
> +exists when an spte created with cr0.wp=0 and cr4.smep=0 is used after
> +changing cr4.smep to 1.  To avoid this, the value of !cr0.wp && cr4.smep
> +is also made a part of the page role.
> +
>  Large pages
>  ===========
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index fc38eca..c7e7f53 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -205,6 +205,7 @@ union kvm_mmu_page_role {
>  		unsigned invalid:1;
>  		unsigned nxe:1;
>  		unsigned cr0_wp:1;
> +		unsigned smep_andnot_wp:1;
>  	};
>  };
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 2d14434..8aaaa23f 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1985,8 +1985,17 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  		spte |= PT_WRITABLE_MASK;
>
>  		if (!vcpu->arch.mmu.direct_map
> -		    && !(pte_access & ACC_WRITE_MASK))
> +		    && !(pte_access & ACC_WRITE_MASK)) {
>  			spte &= ~PT_USER_MASK;
> +			/*
> +			 * If we converted a user page to a kernel page,
> +			 * so that the kernel can write to it when cr0.wp=0,
> +			 * then we should prevent the kernel from executing it
> +			 * if SMEP is enabled.
> +			 */
> +			if (kvm_read_cr4_bits(vcpu, X86_CR4_SMEP))
> +				spte |= PT64_NX_MASK;
> +		}
>
>  		/*
>  		 * Optimization: for pte sync, if spte was writable the hash
> @@ -2955,6 +2964,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
>  int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
>  {
>  	int r;
> +	bool smep = kvm_read_cr4_bits(vcpu, X86_CR4_SMEP);
>  	ASSERT(vcpu);
>  	ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa));
>
> @@ -2969,6 +2979,8 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
>
>  	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
>  	vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu);
> +	vcpu->arch.mmu.base_role.smep_andnot_wp
> +		= smep && !is_write_protection(vcpu);
>
>  	return r;
>  }
> --
> 1.7.5.3
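
For reference, the effect the patch adds to set_spte() can be illustrated with a small
standalone sketch. This is not KVM code: the helper make_spte_bits(), its parameters,
and the simplified bit masks are invented here purely to show how a read-only user gpte
gets remapped as a writable kernel page when cr0.wp=0, and why spte.nx must also be set
once cr4.smep=1.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SPTE_PRESENT  (1ull << 0)
#define SPTE_WRITABLE (1ull << 1)
#define SPTE_USER     (1ull << 2)
#define SPTE_NX       (1ull << 63)

static uint64_t make_spte_bits(bool gpte_user, bool gpte_writable, bool gpte_nx,
                               bool kernel_write_fault, bool cr0_wp, bool cr4_smep)
{
        uint64_t spte = SPTE_PRESENT;

        if (gpte_user)
                spte |= SPTE_USER;
        if (gpte_nx)
                spte |= SPTE_NX;

        /* Either the gpte allows the write, or cr0.wp=0 lets the kernel write anyway. */
        if (gpte_writable || (!cr0_wp && kernel_write_fault)) {
                spte |= SPTE_WRITABLE;

                if (!gpte_writable) {
                        /*
                         * cr0.wp=0 and a kernel write to a read-only gpte:
                         * turn the spte into a kernel page so the write
                         * succeeds while user-mode writes still fault.
                         */
                        spte &= ~SPTE_USER;
                        /*
                         * A kernel page is fetchable by the kernel even with
                         * SMEP on, so also set NX; a later user read/fetch
                         * fault would flip spte.u=1 and spte.nx=gpte.nx back.
                         */
                        if (cr4_smep)
                                spte |= SPTE_NX;
                }
        }

        return spte;
}

int main(void)
{
        /* Guest: user, read-only, executable gpte; kernel write; cr0.wp=0, cr4.smep=1. */
        uint64_t spte = make_spte_bits(true, false, false, true, false, true);

        printf("user=%d writable=%d nx=%d\n",
               !!(spte & SPTE_USER), !!(spte & SPTE_WRITABLE), !!(spte & SPTE_NX));
        /* Prints: user=0 writable=1 nx=1 */
        return 0;
}

Feeding it gpte.u=1, gpte.w=0, cr0.wp=0, cr4.smep=1 yields a supervisor, writable,
non-executable spte, which is exactly the combination the patch documents in mmu.txt.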