On Mon, Nov 2, 2020 at 11:25 AM Vikram Sethi <vsethi@xxxxxxxxxx> wrote:
[..]
> > > At least for passing through memory to VMs (via KVM), you don't actually
> > > need struct pages / memory exposed to the buddy via
> > > add_memory_driver_managed(). Actually, doing that sounds like the wrong
> > > approach.
> > >
> > > E.g., you would "allocate" the memory via devdax/dax_hmat and directly
> > > map the resulting device into guest address space. At least that's what
> > > some people are doing with
> How does memory_failure forwarding to the guest work in that case?
> IIUC it doesn't without a struct page in the host.
> For normal memory, when a VM consumes poison, the host kernel signals
> userspace with SIGBUS and an si_code that says Action Required, which
> QEMU injects into the guest.
> IBM had done something like you suggest with coherent GPU memory, and IIUC
> memory_failure forwarding to the guest VM does not work there.
>
> kernel: https://lkml.org/lkml/2018/12/20/103
> QEMU: https://patchwork.kernel.org/patch/10831455/
>
> I would think we *do want* memory errors to be sent to a VM.
>
> > ...and Joao is working to see if the host kernel can skip allocating
> > 'struct page' or do it on demand if the guest ever requests host
> > kernel services on its memory. Typically it does not, so host 'struct
> > page' space for devdax memory ranges goes wasted.
>
> Is memory_failure forwarded to and handled by guest?

This dovetails with one of the DAX enabling backlog items: removing the
dependencies on page->mapping and page->index in the memory-failure path,
because those also get in the way of reflink. For devdax it's easy to drop
the page->mapping dependency. For fsdax we still need something to redirect
the lookup into the proper filesystem code. Certainly memory-failure support
will not regress; it just means we're stuck with 'struct page' in this path
in the meantime.