On 2025/2/19 18:40, Peter Zijlstra wrote:
On Tue, Feb 18, 2025 at 05:48:00PM +0100, Borislav Petkov wrote:
On Tue, Feb 18, 2025 at 03:15:35PM +0100, Peter Zijlstra wrote:
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index dac4d64dfb2a..cfdae25eacd7 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -301,18 +301,19 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
instrumentation_end();
switch (fixup_type) {
- case EX_TYPE_UACCESS:
- if (!copy_user)
- return IN_KERNEL;
- m->kflags |= MCE_IN_KERNEL_COPYIN;
- fallthrough;
-
case EX_TYPE_FAULT_MCE_SAFE:
case EX_TYPE_DEFAULT_MCE_SAFE:
m->kflags |= MCE_IN_KERNEL_RECOV;
return IN_KERNEL_RECOV;
default:
+ if (copy_user) {
As said on chat, if we can make is_copy_from_user() *always* correctly detect
user access, then sure but I'm afraid EX_TYPE_UACCESS being generated at the
handful places where we do user memory access is there for a reason as it
makes it pretty explicit.
Thing is, we have copy routines that do not know if it's user or not.
is_copy_from_user() must be reliable.
Anyway, if you all really want to go all funny, try the below.
Someone has to go and stick some EX_FLAG_USER on things, but I just
really don't believe that's going to be useful. Because while you're
doing that, you should also audit if is_copy_from_user() will catch it
and if it does, you don't need the tag.
See how many tags you end up with..
Agreed, I think the key point is whether the error context is a read from user
memory. We do not care about the ex-type if we know it's a MOV
reading from userspace.
is_copy_from_user() returns true when both of the following checks are
true:
- the current instruction is a copy
- source address is user memory
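To make the two checks concrete, here is a minimal user-space sketch of that
predicate. It is not the kernel implementation: the sketch_* names, the
instruction enum and the address limit are all hypothetical stand-ins, and
the real function decodes the instruction at regs->ip rather than taking a
pre-decoded struct.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user/kernel split for illustration only (assumption). */
#define SKETCH_USER_ADDR_LIMIT 0x0000800000000000ULL

/* Hypothetical, pre-decoded instruction classes (assumption). */
enum sketch_insn {
	SKETCH_INSN_MOV_LOAD,	/* MOV reading from memory */
	SKETCH_INSN_MOVS,	/* string move */
	SKETCH_INSN_OTHER,	/* anything that is not a copy */
};

struct sketch_fault {
	enum sketch_insn insn;	/* instruction that took the MCE */
	uint64_t src_addr;	/* address it was reading from */
};

/* Check 1: the faulting instruction is a copy. */
static bool sketch_insn_is_copy(enum sketch_insn insn)
{
	return insn == SKETCH_INSN_MOV_LOAD || insn == SKETCH_INSN_MOVS;
}

/* Check 2: the source address lies in the user range. */
static bool sketch_addr_is_user(uint64_t addr)
{
	return addr < SKETCH_USER_ADDR_LIMIT;
}

/* Both checks must hold, mirroring the two bullets above. */
static bool sketch_is_copy_from_user(const struct sketch_fault *f)
{
	return sketch_insn_is_copy(f->insn) && sketch_addr_is_user(f->src_addr);
}
```

The point of the sketch is only that neither check alone is enough: a copy
from a kernel address and a non-copy touching a user address both yield
false.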
If copy_user is true, we set
m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
Then do_machine_check will try fixup_exception first.
/*
* Handle an MCE which has happened in kernel space but from
* which the kernel can recover: ex_has_fault_handler() has
* already verified that the rIP at which the error happened is
* a rIP from which the kernel can recover (by jumping to
* recovery code specified in _ASM_EXTABLE_FAULT()) and the
* corresponding exception handler which would do that is the
* proper one.
*/
if (m->kflags & MCE_IN_KERNEL_RECOV) {
if (!fixup_exception(regs, X86_TRAP_MC, 0, 0))
mce_panic("Failed kernel mode recovery", &err, msg);
}
if (m->kflags & MCE_IN_KERNEL_COPYIN)
queue_task_work(&err, msg, kill_me_never);
So Peter's code is fine to me.
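For reference, the flag flow I have in mind can be modelled in user space
roughly as below. This is only a sketch under assumptions: every sk_* name is
made up for illustration, fixup_exception() is reduced to a boolean argument,
and queue_task_work()/mce_panic() are reduced to fields in a result struct.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the MCE_IN_KERNEL_* kflags (assumption). */
#define SK_MCE_IN_KERNEL_RECOV	(1u << 0)
#define SK_MCE_IN_KERNEL_COPYIN	(1u << 1)

enum sk_ctx { SK_CTX_IN_KERNEL, SK_CTX_IN_KERNEL_RECOV };

enum sk_fixup_type {
	SK_EX_TYPE_NONE,
	SK_EX_TYPE_UACCESS,
	SK_EX_TYPE_FAULT_MCE_SAFE,
	SK_EX_TYPE_DEFAULT_MCE_SAFE,
};

struct sk_mce { unsigned int kflags; };

/* Model of error_context(): copy_user is checked before the ex-type. */
static enum sk_ctx sk_error_context(struct sk_mce *m,
				    enum sk_fixup_type type, bool copy_user)
{
	if (copy_user) {
		m->kflags |= SK_MCE_IN_KERNEL_COPYIN | SK_MCE_IN_KERNEL_RECOV;
		return SK_CTX_IN_KERNEL_RECOV;
	}

	switch (type) {
	case SK_EX_TYPE_FAULT_MCE_SAFE:
	case SK_EX_TYPE_DEFAULT_MCE_SAFE:
		m->kflags |= SK_MCE_IN_KERNEL_RECOV;
		return SK_CTX_IN_KERNEL_RECOV;
	default:
		return SK_CTX_IN_KERNEL;
	}
}

struct sk_outcome {
	bool fixed_up;		/* the fixup was applied */
	bool task_work_queued;	/* stands in for queue_task_work() */
	bool panicked;		/* stands in for mce_panic() */
};

/* Model of the do_machine_check() tail: fixup first, then task work. */
static struct sk_outcome sk_handle(const struct sk_mce *m, bool fixup_ok)
{
	struct sk_outcome out = { false, false, false };

	if (m->kflags & SK_MCE_IN_KERNEL_RECOV) {
		if (!fixup_ok) {
			out.panicked = true;
			return out;
		}
		out.fixed_up = true;
	}

	if (m->kflags & SK_MCE_IN_KERNEL_COPYIN)
		out.task_work_queued = true;

	return out;
}
```

In this model a copy from user memory always ends up with both flags set, so
the fixup is attempted before the task work is queued, whatever the ex-type
is. It also shows the cost of a false positive: if copy_user is true and no
fixup exists, the model panics, which is why is_copy_from_user() has to be
reliable.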
---
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index dac4d64dfb2a..cb021058165f 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -300,13 +300,12 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
copy_user = is_copy_from_user(regs);
instrumentation_end();
- switch (fixup_type) {
- case EX_TYPE_UACCESS:
- if (!copy_user)
- return IN_KERNEL;
- m->kflags |= MCE_IN_KERNEL_COPYIN;
- fallthrough;
+ if (copy_user) {
+ m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
+ return IN_KERNEL_RECOV;
+ }
+ switch (fixup_type) {
case EX_TYPE_FAULT_MCE_SAFE:
case EX_TYPE_DEFAULT_MCE_SAFE:
m->kflags |= MCE_IN_KERNEL_RECOV;
Is that ok? Please correct me if I missed anything.
Thanks.
Shuai