On Thu, Sep 21, 2023, isaku.yamahata@xxxxxxxxx wrote:
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
>
> This test case implements fault injection into guest memory by
> madvise(MADV_HWPOISON) for shared(conventional) memory region and
> KVM_GUEST_MEMORY_FAILURE for private gmem region. Once page is poisoned,
> free the poisoned page and try to run vcpu again to see a new zero page is
> assigned.

Thanks much for the test! I think for the initial merge it makes sense to leave
this out, mainly because I don't think we want a KVM specific ioctl(). But I'll
definitely keep this around to do manual point testing.

> +#define BASE_DATA_SLOT 10
> +#define BASE_DATA_GPA ((uint64_t)(1ull << 32))
> +#define PER_CPU_DATA_SIZE ((uint64_t)(SZ_2M))
> +
> +enum ucall_syncs {
> +	HWPOISON_SHARED,
> +	HWPOISON_PRIVATE,
> +};
> +
> +static void guest_sync_shared(uint64_t gpa)

Probably guest_poison_{shared,private}(), or maybe just open code the
GUEST_SYNC2() calls. I added helpers in the other tests because the ucalls
were a bit more involved than passing the GPA.

However, I don't see any reason to do hypercalls and on-demand
mapping/fallocate. Just have two separate sub-tests, one for private and one
for shared, each with its own host. I'm pretty sure the guest code can be the
same, e.g. I believe it would just boil down to:

static void guest_code(uint64_t gpa)
{
	uint64_t *addr = (void *)gpa;

	WRITE_ONCE(*addr, <some pattern>);

	/* Ask the host to poison the page. */
	GUEST_SYNC(EHWPOISON);

	/*
	 * Access the poisoned page. The host should see a SIGBUS or EHWPOISON
	 * and then truncate the page. After truncation, the page should be
	 * faulted back in and read as zeros, all before the read completes.
	 */
	GUEST_ASSERT_EQ(*(uint64_t *)gpa, 0);
	GUEST_DONE();
}

> +	if (uc.args[0] == HWPOISON_PRIVATE) {
> +		int ret;
> +
> +		inject_memory_failure(gmem_fd, gpa);
> +		ret = _vcpu_run(vcpu);
> +		TEST_ASSERT(ret == -1 && errno == EHWPOISON &&

Honestly, I'm kinda surprised the KVM code actually works :-)

> +			    run->exit_reason == KVM_EXIT_MEMORY_FAULT,
> +			    "exit_reason 0x%x",
> +			    run->exit_reason);
> +		/* Discard the poisoned page and assign new page. */
> +		vm_guest_mem_fallocate(vm, gpa, PAGE_SIZE, true);
> +	} else {
> +		uint8_t *hva = addr_gpa2hva(vm, gpa);
> +		int r;
> +
> +		r = madvise(hva, 8, MADV_HWPOISON);

Huh. TIL there's an MADV_HWPOISON. We've already talked about adding fbind(),
adding an fadvise() seems like the obvious solution. Or maybe overload
fallocate() with a new flag?

Regardless, I think we should add or extend a generic fd-based syscall(), not
throw in something KVM specific.
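
For reference, the "truncate the page and fault back zeros" step can be tried from plain userspace without KVM or any poisoning. The sketch below is an illustration only, not code from the patch: it assumes a Linux host with glibc 2.27+ (for memfd_create()), uses a memfd as a stand-in for the guest memory backing, and punch_hole_reads_zero() is a hypothetical helper name. It writes a pattern, punches a hole with fallocate(), and checks that the next access through the existing mapping reads zero.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch or the KVM selftests): returns 0
 * if a punched-out page of a memfd reads back as zero, -1 on any error.
 */
static int punch_hole_reads_zero(void)
{
	const uint64_t pattern = 0xaaaaaaaaaaaaaaaaull;
	long page_size = sysconf(_SC_PAGESIZE);
	volatile uint64_t *mem;
	int fd, ret = -1;

	if (page_size <= 0)
		return -1;

	/* A memfd stands in for the memory backing; this is an assumption. */
	fd = memfd_create("punch-hole-demo", 0);
	if (fd < 0)
		return -1;

	if (ftruncate(fd, page_size))
		goto out_close;

	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (mem == MAP_FAILED)
		goto out_close;

	/* Populate the page, mirroring the guest's initial write. */
	*mem = pattern;
	assert(*mem == pattern);

	/* Drop the backing page but keep the file size unchanged. */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      0, page_size))
		goto out_unmap;

	/* The next access faults in a fresh page, which must read as zero. */
	ret = (*mem == 0) ? 0 : -1;

out_unmap:
	munmap((void *)mem, page_size);
out_close:
	close(fd);
	return ret;
}
```

Calling punch_hole_reads_zero() from a small main() and checking for a return value of 0 is enough to exercise it. The sketch deliberately skips the poisoning itself, since madvise(MADV_HWPOISON) needs CAP_SYS_ADMIN and a kernel built with memory failure support.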