On Thu, Apr 27, 2023 at 09:06:46AM +0800, Kefeng Wang wrote: > > > On 2023/4/26 23:45, Luck, Tony wrote: > > > > > Thanks for your confirm, and what your option about add > > > > > MCE_IN_KERNEL_COPYIN to EX_TYPE_DEFAULT_MCE_SAFE/FAULT_MCE_SAFE type > > > > > to let do_machine_check call queue_task_work(&m, msg, kill_me_never), > > > > > which kill every call memory_failure_queue() after mc safe copy return? > > > > > > > > I haven't been following this thread closely. Can you give a link to the e-mail > > > > where you posted a patch that does this? Or just repost that patch if easier. > > > > > > The major diff changes is [1], I will post a formal patch when -rc1 out, > > > thanks. > > > > > > [1] > > > https://lore.kernel.org/linux-mm/6dc1b117-020e-be9e-7e5e-a349ffb7d00a@xxxxxxxxxx/ > > > > There seem to be a few misconceptions in that message. Not sure if all of them > > were resolved. Here are some pertinent points: > > > > > > > In my understanding, an MCE should not be triggered when MC-safe copy > > > > > tries > > > > > to access to a memory error. So I feel that we might be talking about > > > > > different scenarios. > > > > This is wrong. There is still a machine check when a MC-safe copy does a read > > from a location that has a memory error. Yes, the above was my first impression to be proven wrong ;) > > > > The recovery flow in this case does not involve queue_task_work(). That is only > > useful for machine check exceptions taken in user context. The queued work will > > be executed to call memory_failure() from the kernel, but in process context (not > > from the machine check exception stack) to handle the error. > > > > For machine checks taken by kernel code (MC-safe copy functions) the recovery > > path is here: > > > > if (m.kflags & MCE_IN_KERNEL_RECOV) { > > if (!fixup_exception(regs, X86_TRAP_MC, 0, 0)) > > mce_panic("Failed kernel mode recovery", &m, msg); > > } > > > > if (m.kflags & MCE_IN_KERNEL_COPYIN) > > queue_task_work(&m, msg, kill_me_never); > > > > The "fixup_exception()" ensures that on return from the machine check handler > > code returns to the extable[] fixup location instead of the instruction that was > > loading from the memory error location. > > > > When the exception was from one of the copy_from_user() variants it makes > > sense to also do the queue_task_work() because the kernel is going to return > > to the user context (with an EFAULT error code from whatever system call was > > attempting the copy_from_user()). > > > > But in the core dump case there is no return to user. The process is being > > terminated by the signal that leads to this core dump. So even though you > > may consider the page being accessed to be a "user" page, you can't fix > > it by queueing work to run on return to user. > > For coredump,the task work will be called too, see following code, > > get_signal > sig_kernel_coredump > elf_core_dump > dump_user_range > _copy_from_iter // with MC-safe copy, return without panic > do_group_exit(ksig->info.si_signo); > do_exit > exit_task_work > task_work_run > kill_me_never > memory_failure > > I also add debug print to check the memory_failure() processing after > add MCE_IN_KERNEL_COPYIN to MCE_SAFE exception type, also tested CoW of > normal page and huge page, it works too. Sounds nice to me. Maybe this information is worth documenting in the patch description. Thanks, Naoya Horiguchi