On Thu, May 28, 2020 at 10:42:38AM +0200, Vitaly Kuznetsov wrote:
> Vivek Goyal <vgoyal@xxxxxxxxxx> writes:
>
> > On Mon, May 25, 2020 at 04:41:17PM +0200, Vitaly Kuznetsov wrote:
> >>
> >
> > [..]
> >> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >> index 0a6b35353fc7..c195f63c1086 100644
> >> --- a/arch/x86/include/asm/kvm_host.h
> >> +++ b/arch/x86/include/asm/kvm_host.h
> >> @@ -767,7 +767,7 @@ struct kvm_vcpu_arch {
> >>  		u64 msr_val;
> >>  		u32 id;
> >>  		bool send_user_only;
> >> -		u32 host_apf_reason;
> >> +		u32 host_apf_flags;
> >
> > Hi Vitaly,
> >
> > What is host_apf_reason used for. Looks like it is somehow used in
> > context of nested guests. I hope by now you have been able to figure
> > it out.
> >
> > Is it somehow the case of that L2 guest takes a page fault exit
> > and then L0 injects this event in L1 using exception. I have been
> > trying to read this code but can't wrap my head around it.
> >
> > I am still concerned about the case of nested kvm. We have discussed
> > apf mechanism but never touched nested part of it. Given we are
> > touching code in nested kvm part, want to make sure it is not broken
> > in new design.
> >
>
> Sorry I missed this.
>
> I think we've touched nested topic a bit already:
> https://lore.kernel.org/kvm/87lfluwfi0.fsf@xxxxxxxxxxxxxxxxxxxx/
>
> But let me try to explain the whole thing and maybe someone will point
> out what I'm missing.

Hi Vitaly,

Sorry, I got busy with some other things. Got back to it now. Thanks for
the explanation. I think I understand it up to some extent now.

Vivek

>
> The problem being solved: L2 guest is running and it is hitting a page
> which is not present *in L0* and instead of pausing *L1* vCPU completely
> we want to let L1 know about the problem so it can run something else
> (e.g. another guest or just another application).
>
> What's different between this and 'normal' APF case. When L2 guest is
> running, the CPU (physical) is in 'guest' mode so we can't inject #PF
> there. Actually, we can but L2 may get confused and we're not even sure
> it's L2's fault, that L2 supported APF and so on. We want to make L1
> deal with the issue.
>
> How does it work then. We inject #PF and L1 sees it as #PF VMEXIT. It
> needs to know about APF (thus KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT) but
> the handling is exactly the same as do_pagefault(): L1's
> kvm_handle_page_fault() checks APF area (shared between L0 and L1) and
> either pauses a task or resumes a previously paused one. This can be a
> L2 guest or something else.
>
> What is 'host_apf_reason'. It is a copy of 'reason' field from 'struct
> kvm_vcpu_pv_apf_data' which we read upon #PF VMEXIT. It indicates that
> the #PF VMEXIT is synthetic.
>
> How does it work with the patchset: 'page not present' case remains the
> same. 'page ready' case now goes through interrupts so it may not get
> handled immediately. External interrupts will be handled by L0 in host
> mode (when L2 is not running). For the 'page ready' case L1 hypervisor
> doesn't need any special handling, kvm_async_pf_intr() irq handler will
> work correctly.
>
> I've smoke tested this with VMX and nothing immediately blew up.
>
> --
> Vitaly
>
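
The L1-side decision described above (read the 'reason' from the shared
APF area on a #PF VMEXIT, then either pause a task or resume a paused
one) can be sketched as a toy C model. This is only an illustration under
stated assumptions, not the kernel code: every name below (apf_reason,
apf_action, apf_shared_area, l1_handle_pf_vmexit) is hypothetical, the
numeric values are placeholders, and clearing the field after reading it
is an assumption of the sketch. It models the pre-patchset flow in which
'page ready' also arrives as a #PF; with the patchset that case is
delivered via interrupt instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum apf_reason {
	APF_NONE = 0,             /* nothing in the shared area: a genuine #PF */
	APF_PAGE_NOT_PRESENT = 1, /* the page is not present in L0 yet */
	APF_PAGE_READY = 2        /* a previously missing page is now available */
};

enum apf_action {
	ACT_REAL_FAULT,  /* not synthetic: handle as an ordinary page fault */
	ACT_PAUSE_TASK,  /* park the faulting task (L2 guest or something else) */
	ACT_RESUME_TASK  /* wake up a previously paused task */
};

/* Toy model of the APF area shared between L0 and L1. */
struct apf_shared_area {
	uint32_t reason;
};

/*
 * Model of L1's reaction to a #PF VMEXIT: copy 'reason' out of the shared
 * area, reset the field (an assumption of this sketch), and pick an action.
 * A non-zero reason means the #PF VMEXIT was synthetic.
 */
enum apf_action l1_handle_pf_vmexit(struct apf_shared_area *area)
{
	uint32_t reason;

	assert(area != NULL);

	reason = area->reason;
	area->reason = APF_NONE;

	switch (reason) {
	case APF_PAGE_NOT_PRESENT:
		return ACT_PAUSE_TASK;
	case APF_PAGE_READY:
		return ACT_RESUME_TASK;
	default:
		return ACT_REAL_FAULT;
	}
}
```

In this model a caller would invoke l1_handle_pf_vmexit() once per #PF
VMEXIT and dispatch on the returned action; a return of ACT_REAL_FAULT
corresponds to the non-synthetic case where the shared area held no
reason.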