Hello Boris,
On 11/2/2022 6:22 AM, Borislav Petkov wrote:
On Mon, Oct 31, 2022 at 04:58:38PM -0500, Kalra, Ashish wrote:
if (snp_lookup_rmpentry(pfn, &rmp_level)) {
do_sigbus(regs, error_code, address, VM_FAULT_SIGBUS);
return RMP_PF_RETRY;
Does this issue some halfway understandable error message why the
process got killed?
I will look at adding our own recovery function for this, but that will
again mark the pages as poisoned, right?
Well, not poisoned but PG_offlimits or whatever the mm folks agree upon.
Semantically, it'll be handled the same way, ofc.
I added a new PG_offlimits flag and a simple corresponding handler for it.
But there is still the added complexity of handling hugepages as part of
reclamation failures (both HugeTLB and transparent hugepages), and that
means calling more static functions in mm/memory-failure.c.
There is probably a more appropriate handler in mm/memory-failure.c:
soft_offline_page() - this will mark the page as HWPoisoned and also has
handling for hugepages. And we can avoid adding a new page flag too.
soft_offline_page - Soft offline a page.
Soft offline a page, by migration or invalidation, without killing anything.
So calling soft_offline_page() instead of memory_failure() looks like a
good option when we fail to transition the page back to the HV/shared
state via SNP_RECLAIM_CMD and/or the RMPUPDATE instruction.
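
To make the proposed flow concrete, here is a rough userspace sketch of the
policy only, not the actual patch. The names snp_handle_reclaim(), struct
reclaim_ops and the stub_* helpers are hypothetical and exist just for
illustration. The soft_offline callback mirrors the (pfn, flags) shape of
soft_offline_page(), and the reclaim callback stands in for the
SNP_RECLAIM_CMD/RMPUPDATE step.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callback table so the policy can be exercised outside the
 * kernel. In a real patch these would be direct calls.
 */
struct reclaim_ops {
	/* Stand-in for SNP_RECLAIM_CMD and/or RMPUPDATE; 0 means success. */
	int (*reclaim)(unsigned long pfn);
	/* Same argument shape as soft_offline_page(pfn, flags); 0 means success. */
	int (*soft_offline)(unsigned long pfn, int flags);
};

/*
 * Returns 0 if the page went back to the HV/shared state,
 *         1 if reclaim failed and the page was soft-offlined instead,
 *        -1 if both steps failed.
 */
static int snp_handle_reclaim(const struct reclaim_ops *ops, unsigned long pfn)
{
	assert(ops != NULL && ops->reclaim != NULL && ops->soft_offline != NULL);

	if (ops->reclaim(pfn) == 0)
		return 0;

	/* Reclaim failed: take the page out of use without killing anything. */
	if (ops->soft_offline(pfn, 0) == 0)
		return 1;

	return -1;
}

/* Userspace stubs, assumptions for illustration only. */
static unsigned long last_offlined_pfn;

static int stub_reclaim_ok(unsigned long pfn)
{
	(void)pfn;
	return 0;
}

static int stub_reclaim_fail(unsigned long pfn)
{
	(void)pfn;
	return -1;
}

static int stub_soft_offline(unsigned long pfn, int flags)
{
	(void)flags;
	last_offlined_pfn = pfn;	/* record which page was offlined */
	return 0;
}
```

The point of the sketch is that the soft-offline path is only taken once
the reclaim step has already failed, so the common case is unaffected.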
Thanks,
Ashish
I am still waiting for more feedback from the mm folks on this.
Just send the patch and they'll give it.
Thx.