Re: [PATCH 3/7] KVM: TDX: Retry locally in TDX EPT violation handler on RET_PF_RETRY

Yan Zhao <yan.y.zhao@xxxxxxxxx> · Mon, 20 Jan 2025 16:05:06 +0800



On Fri, Jan 17, 2025 at 01:14:02PM -0800, Sean Christopherson wrote:
> On Mon, Jan 13, 2025, Yan Zhao wrote:
> > @@ -1884,7 +1904,24 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
> >  	}
> >  
> >  	trace_kvm_page_fault(vcpu, tdexit_gpa(vcpu), exit_qual);
> > -	return __vmx_handle_ept_violation(vcpu, tdexit_gpa(vcpu), exit_qual);
> > +
> > +	while (1) {
> > +		ret = __vmx_handle_ept_violation(vcpu, gpa, exit_qual);
> > +
> > +		if (ret != RET_PF_RETRY || !local_retry)
> > +			break;
> > +
> > +		/*
> > +		 * Break and keep the orig return value.
> 
> Wrap at 80.
> 
> > +		 * Signal & irq handling will be done later in vcpu_run()
> 
> Please don't use "&" as shorthand.  It saves all of two characters.  That said,
Got it!

> I don't see any point in adding this comment, if the reader can't follow the
> logic of this code, these comments aren't going to help them.  And the comment
> about vcpu_run() in particular is misleading, as posted interrupts aren't truly
> handled by vcpu_run(), rather they're handled by hardware (although KVM does send
> a self-IPI).
What about below version?

"
Bail out the local retry
- for pending signal, so that vcpu_run() --> xfer_to_guest_mode_handle_work()
  --> kvm_handle_signal_exit() can exit to userspace for signal handling.
- for pending interrupts, so that tdx_vcpu_enter_exit() --> tdh_vp_enter() will
  be re-executed for interrupt injection through posted interrupt.
- for pending nmi or KVM_REQ_NMI, so that vcpu_enter_guest() will be
  re-executed to process and pend NMI to the TDX module. KVM always regards NMI
  as allowed and the TDX module will inject it when NMI is allowed in the TD.
"

> > +		 */
> > +		if (signal_pending(current) || pi_has_pending_interrupt(vcpu) ||
> > +		    kvm_test_request(KVM_REQ_NMI, vcpu) || vcpu->arch.nmi_pending)
> 
> This needs to check that the IRQ/NMI is actually allowed.  I guess it doesn't
> matter for IRQs, but it does matter for NMIs.  Why not use kvm_vcpu_has_events()?
Yes. However, vt_nmi_allowed() is always true for TDs.
For interrupt, tdx_interrupt_allowed() is always true unless the exit reason is
EXIT_REASON_HLT. For the EPT violation handler, the exit reason should not be
EXIT_REASON_HLT.

> Ah, it's a local function.  At a glance, I don't see any harm in exposing that
> to TDX.
Besides that kvm_vcpu_has_events() is a local function, the consideration to
check "pi_has_pending_interrupt() || kvm_test_request(KVM_REQ_NMI, vcpu) ||
vcpu->arch.nmi_pending" instead that
(1) the two are effectively equivalent for TDs (as nested is not supported yet)
(2) kvm_vcpu_has_events() may lead to unnecessary breaks due to exception
    pending. However, vt_inject_exception() is NULL for TDs.

> > +			break;
> > +
> > +		cond_resched();
> > +	}
> 
> Nit, IMO this reads better as:
> 
> 	do {
> 		ret = __vmx_handle_ept_violation(vcpu, gpa, exit_qual);
> 	} while (ret == RET_PF_RETY && local_retry &&
> 		 !kvm_vcpu_has_events(vcpu) && !signal_pending(current));
>
Hmm, the previous way can save one "cond_resched()" for the common cases, i.e.,
when ret != RET_PF_RETRY or when gpa is shared .

> > +	return ret;
> >  }
> >  
> >  int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
> > -- 
> > 2.43.2
> >