On 5/5/2024 12:02 AM, Miaohe Lin wrote:
On 2024/5/2 7:24, Jane Chu wrote:
The soft hwpoison injector via madvise(MADV_HWPOISON) operates in
a synchronous way, in the sense that the injector is also a process
under test: should it have the poisoned page mapped in its address
space, it should legitimately get killed, just as in a real UE
situation.
Would it be better to add a way to set MF_ACTION_REQUIRED explicitly when injecting soft hwpoison?
Thanks.
So the first question is: Is there a need to preserve the existing
behavior of madvise(MADV_HWPOISON)?
The madvise(2) man page says -
MADV_HWPOISON (since Linux 2.6.32)
       Poison the pages in the range specified by addr and length
       and handle subsequent references to those pages like a
       hardware memory corruption.  This operation is available
       only for privileged (CAP_SYS_ADMIN) processes.  This
       operation may result in the calling process receiving a
       SIGBUS and the page being unmapped.

       This feature is intended for testing of memory
       error-handling code; it is available only if the kernel
       was configured with CONFIG_MEMORY_FAILURE.
My impression from reading it is that there doesn't seem to be a need.
A couple of observations:
- The man page states that the calling process may receive a SIGBUS and that the page may be unmapped.
  But the existing behavior is no SIGBUS unless MCE early kill is elected, so it doesn't quite match
  the man page.
- There is also 'hwpoison-inject', which behaves similarly to the existing madvise(MADV_HWPOISON), that is,
  soft injection without the MF_ACTION_REQUIRED flag.
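
A rough userspace sketch for checking the observations above. It is not
part of the patch, and the names outcome_name(), elect_early_kill() and
inject_and_touch() are made up for illustration. It assumes a kernel built
with CONFIG_MEMORY_FAILURE and a caller with CAP_SYS_ADMIN. It only reports
whether a SIGBUS shows up while madvise() returns, on the next access, or
not at all.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <unistd.h>

#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100	/* value from the Linux uapi mman headers */
#endif

/* Hypothetical classification of what the caller observed. */
enum outcome { INJECT_FAILED, SIGBUS_ON_INJECT, SIGBUS_ON_ACCESS, NO_SIGBUS };

static sigjmp_buf sigbus_env;

static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(sigbus_env, 1);	/* unwind back into inject_and_touch() */
}

const char *outcome_name(enum outcome o)
{
	switch (o) {
	case SIGBUS_ON_INJECT: return "SIGBUS during madvise()";
	case SIGBUS_ON_ACCESS: return "SIGBUS on later access";
	case NO_SIGBUS:        return "no SIGBUS";
	default:               return "injection failed";
	}
}

/* Elect MCE early kill for this process; returns the policy read back, or -1. */
int elect_early_kill(void)
{
	if (prctl(PR_MCE_KILL, (unsigned long)PR_MCE_KILL_SET,
		  (unsigned long)PR_MCE_KILL_EARLY, 0UL, 0UL) != 0)
		return -1;
	return prctl(PR_MCE_KILL_GET, 0UL, 0UL, 0UL, 0UL);
}

/* Map one anonymous page, poison it, then read it back. */
enum outcome inject_and_touch(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	void *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	volatile char *p = map;
	volatile int injected = 0;	/* volatile: must survive siglongjmp() */
	struct sigaction sa;

	if (map == MAP_FAILED)
		return INJECT_FAILED;
	p[0] = 1;			/* fault the page in so it is mapped */

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, NULL) != 0)
		return INJECT_FAILED;

	if (sigsetjmp(sigbus_env, 1))	/* non-zero: we arrived here via SIGBUS */
		return injected ? SIGBUS_ON_ACCESS : SIGBUS_ON_INJECT;

	if (madvise(map, len, MADV_HWPOISON) != 0)
		return INJECT_FAILED;	/* e.g. EPERM without CAP_SYS_ADMIN */
	injected = 1;
	(void)p[0];			/* touch the poisoned page */
	return NO_SIGBUS;
}
```

A main() that prints outcome_name(inject_and_touch()), with or without a
prior call to elect_early_kill(), should be enough to compare kernels with
and without the change. If I read the thread correctly, the patched kernel
would be expected to report "SIGBUS during madvise()" even without early
kill, but I have not verified that. Note that the poisoned page is really
taken out of service, so this is best run on a test machine.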
thanks,
-jane
Signed-off-by: Jane Chu <jane.chu@xxxxxxxxxx>
---
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 1a073fcc4c0c..eaeae5252c02 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1127,7 +1127,7 @@ static int madvise_inject_error(int behavior,
 		} else {
 			pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n",
 				 pfn, start);
-			ret = memory_failure(pfn, MF_COUNT_INCREASED | MF_SW_SIMULATED);
+			ret = memory_failure(pfn, MF_ACTION_REQUIRED | MF_COUNT_INCREASED | MF_SW_SIMULATED);
 			if (ret == -EOPNOTSUPP)
 				ret = 0;
 		}