> > Hi Sean, > > After more thoughts, do you think checking KVM internal memslot is necessary? > > I don't think it's necessary per se, but I also can't think of any reason to allow > it. > > > slot = gfn_to_memslot(kvm, gfn); > > if (!slot || slot->id >= KVM_USER_MEM_SLOTS) { > > srcu_read_unlock(&kvm->srcu, idx); > > return -EINVAL; > > } > > > > Do we allow write tracking to APIC access page when APIC-write VM exit > > is not desired? > > Allow? Yes. > > But KVM doesn't use write-tracking for anything APICv related, e.g. to disable > APICv, KVM instead zaps the SPTEs for the APIC access page and on page fault goes > straight to MMIO emulation. > > Theoretically, the guest could create an intermediate PTE in the APIC access page > and AFAICT KVM would shadow the access and write-protect the APIC access page. > But that's benign as the resulting emulation would be handled just like emulated > APIC MMIO. > > FWIW, the other internal memslots, TSS and idenity mapped page tables, are used > if and only if paging is disabled in the guest, i.e. there are no guest PTEs for > KVM to shadow (and paging must be enabled to enable VMX, so nested EPT is also > ruled out). So this is theoretically possible only for the APIC access page. > That changes with KVMGT, but that again should not be problematic. KVM will > emulate in response to the write-protected page and things go on. E.g. it's > arguably much weirder that the guest can read/write the identity mapped page > tables that are used for EPT without unrestricted guest. > > There's no sane reason to allow creating PTEs in the APIC page, but I'm also not > all that motivated to "fix" things. account_shadowed() isn't expected to fail, > so KVM would need to check further up the stack, e.g. in walk_addr_generic() by > open coding a form of kvm_vcpu_gfn_to_hva_prot(). > > I _think_ that's the only place KVM would need to add a check, as KVM already > checks that the root, i.e. CR3, is in a "visible" memslot. I suppose KVM could > just synthesize triple fault, like it does for the root/CR3 case, but I don't > like making up behavior. > > In other words, I'm not opposed to disallowing write-tracking internal memslots, > but I can't think of anything that will break, and so for me personally at least, > the ROI isn't sufficient to justify writing tests and dealing with any fallout. It makes sense. Thanks for the explanation.