RE: [PATCH v1 12/23] KVM: VMX: Handle FRED event data

"Li, Xin3" <xin3.li@xxxxxxxxx> · Tue, 14 Nov 2023 04:34:02 +0000

> >+			else if (is_nm_fault(intr_info) &&
> >+				 vcpu->arch.guest_fpu.fpstate->xfd)
> 
> does this necessarily mean the #NM is caused by XFD?

Then the event data should be 0.  Or I missed something obvious? I.e.,
it can be easily differentiated and we should just explicitly set it
to 0.

> 
> >+				event_data = vcpu->arch.guest_fpu.xfd_err;
> >+
> >+			vmcs_write64(INJECTED_EVENT_DATA, event_data);
> >+		}
> >+	}
> >+
> > 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
> >
> > 	vmx_clear_hlt(vcpu);
> >@@ -7226,7 +7247,8 @@ static void vmx_recover_nmi_blocking(struct
> >vcpu_vmx *vmx)  static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
> > 				      u32 idt_vectoring_info,
> > 				      int instr_len_field,
> >-				      int error_code_field)
> >+				      int error_code_field,
> >+				      int event_data_field)
> 
> event_data_field is used to indicate whether this is a "cancel". I may think it is
> better to simply use a boolean e.g., bool cancel.

I'm fine with the idea if no objections.

> 
> 
> > {
> > 	u8 vector;
> > 	int type;
> >@@ -7260,6 +7282,37 @@ static void __vmx_complete_interrupts(struct
> kvm_vcpu *vcpu,
> > 		vcpu->arch.event_exit_inst_len = vmcs_read32(instr_len_field);
> > 		fallthrough;
> > 	case INTR_TYPE_HARD_EXCEPTION:
> >+		if (kvm_is_fred_enabled(vcpu) && event_data_field) {
> >+			/*
> >+			 * Save original-event data for being used as injected-
> event data.
> >+			 */
> 
> Looks we also expect CPU will update CR2/DR6/XFD_ERR. this hunk looks to me
> just a paranoid check to ensure the cpu works as expected. if that's the case, I
> suggest documenting it a bit in the comment.

These checks are not intended for hw, they make sure nVMX FRED is correctly
implemented and catch regressions.

And yes, in the early stage, I prefer to be paranoid.

> 
> >+			u64 event_data = vmcs_read64(event_data_field);
> >+
> >+			switch (vector) {
> >+			case DB_VECTOR:
> >+				get_debugreg(vcpu->arch.dr6, 6);
> >+				WARN_ON(vcpu->arch.dr6 != (event_data ^
> DR6_RESERVED));
> >+				vcpu->arch.dr6 = event_data ^ DR6_RESERVED;
> >+				break;
> >+			case NM_VECTOR:
> >+				if (vcpu->arch.guest_fpu.fpstate->xfd) {
> >+					rdmsrl(MSR_IA32_XFD_ERR, vcpu-
> >arch.guest_fpu.xfd_err);
> >+					WARN_ON(vcpu-
> >arch.guest_fpu.xfd_err != event_data);
> >+					vcpu->arch.guest_fpu.xfd_err =
> event_data;
> >+				} else {
> >+					WARN_ON(event_data != 0);
> >+				}
> >+				break;
> >+			case PF_VECTOR:
> >+				WARN_ON(vcpu->arch.cr2 != event_data);
> >+				vcpu->arch.cr2 = event_data;
> >+				break;
> >+			default:
> >+				WARN_ON(event_data != 0);
> 
> I am not sure if this WARN_ON() can be triggeded by nested VMX. It is legitimate
> for L1 VMM to inject any event w/ an event_data.
> 
> FRED spec says:
> 
> Section 5.2.1 specifies the event data that FRED event delivery of certain events
> saves on the stack. When FRED event delivery is used for an event injected by VM
> entry, the event data saved is the value of the injected-event-data field in the
> VMCS. This value is used instead of what is specified in Section 5.2.1 and is done
> for __ALL__ injected events using FRED event delivery

5.2.1 Saving Information on the Regular Stack also says:
- For any other event, the event data are not currently defined and will
  be zero until they are.

Or you mean something else?