On Wed, Jul 4, 2018 at 2:40 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> Changes since v4 [1]:
> * Rework dax_lock_page() to reuse get_unlocked_mapping_entry() (Jan)
>
> * Change the calling convention to take a 'struct page *' and return
>   success / failure instead of performing the pfn_to_page() internal to
>   the api (Jan, Ross).
>
> * Rename dax_lock_page() to dax_lock_mapping_entry() (Jan)
>
> * Account for the case that a given pfn can be fsdax mapped with
>   different sizes in different vmas (Jan)
>
> * Update collect_procs() to determine the mapping size of the pfn for
>   each page given it can be variable in the dax case.
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2018-June/016279.html
>
> ---
>
> As it stands, memory_failure() gets thoroughly confused by dev_pagemap
> backed mappings. The recovery code has specific enabling for several
> possible page states and needs new enabling to handle poison in dax
> mappings.
>
> In order to support reliable reverse mapping of user space addresses:
>
> 1/ Add new locking in the memory_failure() rmap path to prevent races
> that would typically be handled by the page lock.
>
> 2/ Since dev_pagemap pages are hidden from the page allocator and the
> "compound page" accounting machinery, add a mechanism to determine the
> size of the mapping that encompasses a given poisoned pfn.
>
> 3/ Given pmem errors can be repaired, change the speculatively accessed
> poison protection, mce_unmap_kpfn(), to be reversible and otherwise
> allow ongoing access from the kernel.
>
> A side effect of this enabling is that MADV_HWPOISON becomes usable for
> dax mappings, however the primary motivation is to allow the system to
> survive userspace consumption of hardware-poison via dax. Specifically
> the current behavior is:
>
>     mce: Uncorrected hardware memory error in user-access at af34214200
>     {1}[Hardware Error]: It has been corrected by h/w and requires no further action
>     mce: [Hardware Error]: Machine check events logged
>     {1}[Hardware Error]: event severity: corrected
>     Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users
>     [..]
>     Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed
>     mce: Memory error not recovered
>     <reboot>
>
> ...and with these changes:
>
>     Injecting memory failure for pfn 0x20cb00 at process virtual address 0x7f763dd00000
>     Memory failure: 0x20cb00: Killing dax-pmd:5421 due to hardware memory corruption
>     Memory failure: 0x20cb00: recovery action for dax page: Recovered
>
> Given all the cross dependencies I propose taking this through
> nvdimm.git with acks from Naoya, x86/core, x86/RAS, and of course dax
> folks.
>

Hi,

Any comments on this series? Matthew is patiently waiting to rebase
some of his Xarray work until the dax_lock_mapping_entry() changes hit
-next.