On 2022/6/8 16:10, Jarkko Sakkinen wrote:
> LGTM, I'll have to check if I'm able to trigger MCE with
> /sys/devices/system/memory/hard_offline_page, as hinted by Tony.
>
> Just trying to think how to get a legit PFN number. I guess one workable
> way is to attach kretprobe to sgx_alloc_epc_page(), and do similar
> conversion as in sgx_get_epc_phys_addr() for ((struct sgx_epc_page
> *)retval) and print it out.
>

We follow the hint in Documentation/firmware-guide/acpi/apei/einj.rst
added by Tony. To validate the virtualization part, we do steps 1~2 on
the host and steps 3~7 in the VM.

Regarding how to get the SGX EPC page mappings GVA -> GPA -> HPA, we do
something like this:

1. Get GVA -> GPA in the guest OS

   1) Find the probe point in sgx_vma_fault(); vmf_insert_pfn() is
      called only once in sgx_vma_fault():

      crash> dis sgx_vma_fault | grep vmf_insert_pfn
      0xffffffff8ce527b1 <sgx_vma_fault+113>: callq 0xffffffff8d0ec1d0 <vmf_insert_pfn>

   2) Get the mapping of GVA to guest PFN

      echo 'p:sgxvmfault sgx_vma_fault+113 vaddr=%si pfn=%dx' >> /sys/kernel/debug/tracing/kprobe_events
      cat /sys/kernel/debug/tracing/kprobe_events
      echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
      cat /sys/kernel/debug/tracing/trace_pipe

2. Get GPA -> HPA on the host OS

   __sgx_vepc_fault() can tell us the mapping of HVA -> HPA, but to
   inject a memory failure, we need GPA -> HPA. There are several ways
   to achieve this, e.g.:

   - Patch QEMU to show GPA -> HVA, then we can easily convert HVA -> HPA
   - Walk the EPT table
   - Patch the kernel to show GPA -> HPA

   We use the last one because it's the most straightforward.

@@ -4047,6 +4047,8 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	else
 		r = __direct_map(vcpu, fault);
 
+	if (!!sgx_paddr_to_page(fault->pfn << PAGE_SHIFT))
+		trace_printk("SGX: gpa:0x%llx hpa:0x%llx\n", fault->gfn << PAGE_SHIFT, fault->pfn << PAGE_SHIFT);
 out_unlock:
 	if (is_tdp_mmu_fault)
 		read_unlock(&vcpu->kvm->mmu_lock);

(Because the filter of an ftrace kprobe cannot support such a complex
expression, we have to patch the host kernel directly.)

Then we have the mappings GVA -> GPA -> HPA, and next we can inject real
errors into enclave memory using ACPI/EINJ. Touching the GVA in the
guest OS will trigger the bug and show how patch 02 works.

Finally, the QEMU console will show the message below, but QEMU will not
be killed:

qemu-system-x86_64: Guest MCE Memory Error at QEMU addr 0x7f3273f2a000 and GUEST addr 0x18012b000 of type BUS_MCEERR_AR injected

Best Regards,
Zhiquan

> BR, Jarkko
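
One note on joining the two traces: the kprobe in the guest prints a guest PFN, while the host-side trace_printk() prints byte addresses, so the PFN has to be shifted by PAGE_SHIFT before the two can be matched. A minimal sketch of that arithmetic, assuming 4 KiB pages (PAGE_SHIFT = 12); pfn_to_pa and pa_to_pfn are hypothetical helper names used only for illustration, and the sample value is derived from the GUEST addr in the QEMU message above:

```shell
#!/bin/sh
# Assumption: 4 KiB base pages, so PAGE_SHIFT is 12.
PAGE_SHIFT=12

# Hypothetical helper: convert a page frame number (hex, as printed by
# the kprobe's pfn= field) to the physical address of the page start.
pfn_to_pa() {
    printf '0x%x\n' "$(( $1 << PAGE_SHIFT ))"
}

# Hypothetical helper: convert a physical address back to its page
# frame number, dropping the offset within the page.
pa_to_pfn() {
    printf '0x%x\n' "$(( $1 >> PAGE_SHIFT ))"
}

# A guest PFN of 0x18012b corresponds to the GPA in the QEMU message.
pfn_to_pa 0x18012b
```

The same shift applies on the host side, where fault->gfn and fault->pfn are frame numbers and the patch prints them as addresses.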