Re: [RFC PATCH 13/21] KVM: X86: Handle private MMIO as shared

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 02, 2024 at 12:22:56PM +1000, Alexey Kardashevskiy wrote:
> 
> 
> On 31/8/24 02:57, Xu Yilun wrote:
> > On Fri, Aug 23, 2024 at 11:21:27PM +1000, Alexey Kardashevskiy wrote:
> > > Currently private MMIO nested page faults are not expected so when such
> > > fault occurs, KVM tries moving the faulted page from private to shared
> > > which is not going to work as private MMIO is not backed by memfd.
> > > 
> > > Handle private MMIO as shared: skip page state change and memfd
> > 
> > This means host keeps the mapping for private MMIO, which is different
> > from private memory. Not sure if it is expected, and I want to get
> > some directions here.
> 
> There is no other translation table on AMD though, the same NPT. The

Sorry for not being clear, when I say "host mapping" I mean host
userspace mapping (host CR3 mapping). By using guest_memfd, there is no
host CR3 mapping for private memory. I'm wondering if we could keep host
CR3 mapping for private MMIO.

> security is enforced by the RMP table. A device says "bar#x is private" so
> the host + firmware ensure the each corresponding RMP entry is "assigned" +
> "validated" and has a correct IDE stream ID and ASID, and the VM's kernel
> maps it with the Cbit set.
> 
> >  From HW perspective, private MMIO is not intended to be accessed by
> > host, but the consequence may varies. According to TDISP spec 11.2,
> > my understanding is private device (known as TDI) should reject the
> > TLP and transition to TDISP ERROR state. But no further error
> > reporting or logging is mandated. So the impact to the host system
> > is specific to each device. In my test environment, an AER
> > NonFatalErr is reported and nothing more, much better than host
> > accessing private memory.
> 
> afair I get an non-fatal RMP fault so the device does not even notice.
> 
> > On SW side, my concern is how to deal with mmu_notifier. In theory, if
> > we get pfn from hva we should follow the userspace mapping change. But
> > that makes no sense. Especially for TDX TEE-IO, private MMIO mapping
> > in SEPT cannot be changed or invalidated as long as TDI is running.
> 
> > Another concern may be specific for TDX TEE-IO. Allowing both userspace
> > mapping and SEPT mapping may be safe for private MMIO, but on
> > KVM_SET_USER_MEMORY_REGION2,  KVM cannot actually tell if a userspace
> > addr is really for private MMIO. I.e. user could provide shared memory
> > addr to KVM but declare it is for private MMIO. The shared memory then
> > could be mapped in SEPT and cause problem.
> 
> I am missing lots of context here. When you are starting a guest with a
> passed through device, until the TDISP machinery transitions the TDI into
> RUN, this TDI's MMIO is shared and mapped everywhere. And after

Yes, that's the situation nowadays. I think if we need to eliminate
host CR3 mapping for private MMIO, a simple way is we don't allow host
CR3 mapping at the first place, even for shared pass through. It is
doable cause:

 1. IIUC, host CR3 mapping for assigned MMIO is only used for pfn
    finding, i.e. host doesn't really (or shouldn't?) access them.
 2. The hint from guest_memfd shows KVM doesn't have to rely on host
    CR3 mapping to find pfn.

> transitioning to RUN you move mappings from EPT to SEPT?

Mostly correct, TDX move mapping from EPT to SEPT after LOCKED and
right before RUN.

> 
> > So personally I prefer no host mapping for private MMIO.
> 
> Nah, cannot skip this step on AMD. Thanks,

Not sure if we are on the same page. I assume from HW perspective, host
CR3 mapping is not necessary for NPT/RMP build?

Thanks,
Yilun

> 
> 
> > 
> > Thanks,
> > Yilun
> > 
> > > page state tracking.
> > > 
> > > The MMIO KVM memory slot is still marked as shared as the guest can
> > > access it as private or shared so marking the MMIO slot as private
> > > is not going to help.
> > > 
> > > Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxx>
> > > ---
> > >   arch/x86/kvm/mmu/mmu.c | 6 +++++-
> > >   1 file changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 928cf84778b0..e74f5c3d0821 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -4366,7 +4366,11 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> > >   {
> > >   	bool async;
> > > -	if (fault->is_private)
> > > +	if (fault->slot && fault->is_private && !kvm_slot_can_be_private(fault->slot) &&
> > > +	    (vcpu->kvm->arch.vm_type == KVM_X86_SNP_VM))
> > > +		pr_warn("%s: private SEV TIO MMIO fault for fault->gfn=%llx\n",
> > > +			__func__, fault->gfn);
> > > +	else if (fault->is_private)
> > >   		return kvm_faultin_pfn_private(vcpu, fault);
> > >   	async = false;
> > > -- 
> > > 2.45.2
> > > 
> > > 
> 
> -- 
> Alexey
> 




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux