On Mon, Feb 21, 2022 at 07:48:20PM -0800, Andrew Morton wrote:
> The patch titled
>      Subject: mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler
> has been added to the -mm tree.  Its filename is
>      mm-hwpoison-avoid-the-impact-of-hwpoison_filter-return-value-on-mce-handler.patch

Andrew, please drop this patch from your queue. Review below:

> From: luofei <luofei@xxxxxxxxxxxx>
> Subject: mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler
>
> When the hwpoison page meets the filter conditions, it should not be
> regarded as successful memory_failure() processing for mce handler, but
> should return a value(-EHWPOISON), otherwise mce handler regards the error
> page has been identified and isolated, which may lead to calling
> set_mce_nospec() to change page attribute, etc.
>
> Here a new MF_MCE_HANDLE flag is introduced to identify the call from the
> mce handler and instruct hwpoison_filter() to return -EHWPOISON, otherwise
> return 0 for compatibility with the hwpoison injector.
>
> Link: https://lkml.kernel.org/r/20220221021415.2328992-1-luofei@xxxxxxxxxxxx
> Signed-off-by: luofei <luofei@xxxxxxxxxxxx>
> Cc: Tony Luck <tony.luck@xxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> Cc: H. Peter Anvin <hpa@xxxxxxxxx>
> Cc: Fei Luo <luofei@xxxxxxxxxxxx>
> Cc: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
>
>  arch/x86/kernel/cpu/mce/core.c | 15 +++++++++------
>  include/linux/mm.h             |  1 +
>  mm/memory-failure.c            | 14 ++++++++++++--
>  3 files changed, 22 insertions(+), 8 deletions(-)
>
> --- a/arch/x86/kernel/cpu/mce/core.c~mm-hwpoison-avoid-the-impact-of-hwpoison_filter-return-value-on-mce-handler
> +++ a/arch/x86/kernel/cpu/mce/core.c
> @@ -612,7 +612,7 @@ static int uc_decode_notifier(struct not
>  		return NOTIFY_DONE;
>
>  	pfn = mce->addr >> PAGE_SHIFT;
> -	if (!memory_failure(pfn, 0)) {
> +	if (!memory_failure(pfn, MF_MCE_HANDLE)) {
>  		set_mce_nospec(pfn, whole_page(mce));
>  		mce->kflags |= MCE_HANDLED_UC;
>  	}
> @@ -1286,7 +1286,7 @@ static void kill_me_now(struct callback_
>  static void kill_me_maybe(struct callback_head *cb)
>  {
>  	struct task_struct *p = container_of(cb, struct task_struct, mce_kill_me);
> -	int flags = MF_ACTION_REQUIRED;
> +	int flags = MF_ACTION_REQUIRED | MF_MCE_HANDLE;
>  	int ret;
>
>  	p->mce_count = 0;
> @@ -1303,9 +1303,12 @@ static void kill_me_maybe(struct callbac
>  	}
>
>  	/*
> -	 * -EHWPOISON from memory_failure() means that it already sent SIGBUS
> -	 * to the current process with the proper error info, so no need to
> -	 * send SIGBUS here again.
> +	 * -EHWPOISON from memory_failure() means that memory_failure() did
> +	 * not handle the error event for the following reason:
> +	 *  - SIGBUS has already been sent to the current process with the
> +	 *    proper error info, or
> +	 *  - hwpoison_filter() filtered the event,
> +	 * so no need to deal with it more.

So you're overloading what we do for -EHWPOISON because it just fits?

The right way to do this is to return a *distinct* error value which
means "ignore this error" and the MCE code can then ignore it.

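To make the "distinct error value" point concrete, here is a minimal
user-space sketch, not kernel code and not a proposed patch. Everything
prefixed with sketch_ or SKETCH_ is hypothetical, and the use of
EOPNOTSUPP as the "filtered, ignore this" code is only an assumption for
illustration; which errno to pick is up to the patch author. The stub
stands in for memory_failure() and takes no MF flag at all; the caller
decides what to do purely from the return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Fallback for libcs that lack it; Linux defines EHWPOISON as 133. */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif

/* Assumption: some errno distinct from EHWPOISON marks "event filtered". */
#define SKETCH_EFILTERED EOPNOTSUPP

enum sketch_outcome {
	SKETCH_RECOVERED,          /* page handled, caller may isolate it */
	SKETCH_ALREADY_SIGNALLED,  /* SIGBUS already sent, nothing more to do */
	SKETCH_IGNORED,            /* filtered: no further processing */
	SKETCH_FAILED,             /* any other error */
};

/* Hypothetical stand-in for memory_failure(); note there is no caller flag. */
static int sketch_memory_failure(bool filtered, bool already_poisoned)
{
	if (filtered)
		return -SKETCH_EFILTERED;
	if (already_poisoned)
		return -EHWPOISON;
	return 0;
}

/* Caller side: each return value has exactly one meaning. */
static enum sketch_outcome sketch_classify(int ret)
{
	if (ret == 0)
		return SKETCH_RECOVERED;
	if (ret == -EHWPOISON)
		return SKETCH_ALREADY_SIGNALLED;
	if (ret == -SKETCH_EFILTERED)
		return SKETCH_IGNORED;
	return SKETCH_FAILED;
}
```

With this shape, -EHWPOISON keeps its single existing meaning and the
filtered case cannot be mistaken for either success or "SIGBUS already
sent".
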
> --- a/include/linux/mm.h~mm-hwpoison-avoid-the-impact-of-hwpoison_filter-return-value-on-mce-handler
> +++ a/include/linux/mm.h
> @@ -3173,6 +3173,7 @@ enum mf_flags {
>  	MF_MUST_KILL = 1 << 2,
>  	MF_SOFT_OFFLINE = 1 << 3,
>  	MF_UNPOISON = 1 << 4,
> +	MF_MCE_HANDLE = 1 << 5,

This thing is too x86-specific and means nothing for other arches which
use memory_failure().

And I don't like this jumping through hoops one bit: MCE code sets
MF_MCE_HANDLE to tell the memory failure code to return a special error
which it does and then the MCE code again looks at that special error.

That's just nuts.

As said above, all you wanna do is have memory_failure() return a
distinct error value which says "ignore this error and don't do any
further processing". Without a new MF flag.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette