On Fri, Apr 22, 2022 at 4:25 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> [ Add Tony as the originator of the whole_page() logic and Jue who
> reported the issue that led to 17fae1294ad9 x86/{mce,mm}: Unmap the
> entire page if the whole page is affected and poisoned ]
>
>
> On Fri, Apr 22, 2022 at 3:46 PM Jane Chu <jane.chu@xxxxxxxxxx> wrote:
> >
> > The set_memory_uc() approach doesn't work well in all cases.
> > As Dan pointed out when "The VMM unmapped the bad page from
> > guest physical space and passed the machine check to the guest."
> > "The guest gets virtual #MC on an access to that page. When
> > the guest tries to do set_memory_uc() and instructs cpa_flush()
> > to do clean caches that results in taking another fault / exception
> > perhaps because the VMM unmapped the page from the guest."
> >
> > Since the driver has special knowledge to handle NP or UC,
> > mark the poisoned page with NP and let the driver handle it when
> > it comes down to repair.
> >
> > Please refer to discussions here for more details.
> > https://lore.kernel.org/all/CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@xxxxxxxxxxxxxx/
> >
> > Now since the poisoned page is marked as not-present, in order to
> > avoid writing to a not-present page and triggering a kernel Oops,
> > also fix pmem_do_write().
> >
> > Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()")
> > Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> > Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Signed-off-by: Jane Chu <jane.chu@xxxxxxxxxx>

Boris,

This is the last patch in this set that needs an x86 maintainer ack.
Since you have been involved in the history for most of this, mind
giving it an ack so I can pull it in for v5.19? Let me know if you
want a resend.