On Fri, Oct 21, 2022 at 09:52:01AM +0800, Shuai Xue wrote:
>
> On 2022/10/21 AM4:05, Tony Luck wrote:
> > On Thu, Oct 20, 2022 at 09:57:04AM +0800, Shuai Xue wrote:
> >>
> >> On 2022/10/20 AM1:08, Tony Luck wrote:
> >
> > I'm experimenting with using schedule_work() to handle the call to
> > memory_failure() (echoing what the machine check handler does using
> > task_work_add() to avoid the same problem of not being able to directly
> > call memory_failure()).
>
> Work queues permit work to be deferred outside of the interrupt context
> into the kernel process context. If we return to user-space before the
> queued memory_failure() work is processed, we will take the fault again,
> as we discussed recently.
>
>     commit 7f17b4a121d0d ACPI: APEI: Kick the memory_failure() queue for synchronous errors
>     commit 415fed694fe11 ACPI: APEI: do not add task_work to kernel thread to avoid memory leak
>
> So, in my opinion, we should add memory failure as a task work, like
> do_machine_check does, e.g.
>
>     queue_task_work(&m, msg, kill_me_maybe);

Maybe ... but this case isn't pending back to a user instruction
that is trying to READ the poison memory address. The task is just
trying to WRITE to any address within the page.

So this is much more like a patrol scrub error found asynchronously
by the memory controller (in this case found asynchronously by the
Linux page copy function). So I don't feel that it's really the
responsibility of the current task.

When we do return to user mode the task is going to be busy servicing
a SIGBUS ... so it shouldn't try to touch the poison page before the
memory_failure() called by the worker thread cleans things up.

> > +	INIT_WORK(&p->work, do_sched_memory_failure);
> > +	p->pfn = pfn;
> > +	schedule_work(&p->work);
> > +}
>
> I think there is already a function to do such work in mm/memory-failure.c.
>
>     void memory_failure_queue(unsigned long pfn, int flags)

Also pointed out by Miaohe Lin <linmiaohe@xxxxxxxxxx> ... this does
exactly what I want, and is working well in tests so far. So perhaps
a cleaner solution than making the kill_me_maybe() function globally
visible.

-Tony