On Mon 2022-05-09 14:13:17, Rik van Riel wrote:
> On Mon, 2022-05-09 at 11:38 +0200, Peter Zijlstra wrote:
> > On Mon, May 09, 2022 at 08:06:22AM +0000, Song Liu wrote:
> > > 
> > > > On May 9, 2022, at 12:04 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > > 
> > > > On Sat, May 07, 2022 at 10:46:28AM -0700, Song Liu wrote:
> > > > > Busy kernel threads may block the transition of livepatch. Call
> > > > > klp_try_switch_task from __cond_resched to make the transition
> > > > > easier.
> > > > 
> > > > What will a PREEMPT=y kernel do? How is it not a problem there,
> > > > and if it is, this will not help that.
> > 
> > Not really. There is no difference between an explicit preemption point
> > (cond_resched) or an involuntary preemption point (PREEMPT=y).
> > 
> > So unless you can *exactly* say why it isn't a problem on PREEMPT=y,
> > none of this makes any sense.
> 
> I suspect it is a problem on PREEMPT=y too, but is there some sort
> of fairly light weight (in terms of stuff we need to add to the kernel)
> solution that could solve both?
> 
> Do we have some real time per-CPU kernel threads we could just
> issue a NOOP call to, which would preempt long-running kernel
> threads (like a kworker with oodles of work to do)?
> 
> Could the stopper workqueue be a suitable tool for this job?

An interesting solution would be to queue an irq_work on the CPU that is
occupied by the long-running kernel task. It might be queued from
klp_try_complete_transition(), which is called from the regular
klp_transition_work_fn(). Then the task might try to migrate itself
from the irq_work.

But the problem is that stack_trace_save_tsk_reliable() probably will
not be able to store a reliable backtrace for the interrupted task.
So, we might really need to stop the task (CPU). But there still might
be a problem if stack_trace_save_tsk_reliable() considers the stack
reliable.

Best Regards,
Petr
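
For illustration, a very rough and completely untested sketch of the irq_work
idea above; klp_migrate_irq_work_fn() and klp_kick_busy_task() are made-up
names, not existing kernel API, and the handler only asks the interrupted
task to reschedule instead of trying to switch it directly:

/* Hypothetical sketch, not actual kernel code. */
#include <linux/irq_work.h>
#include <linux/percpu.h>
#include <linux/sched.h>

static void klp_migrate_irq_work_fn(struct irq_work *work)
{
	/*
	 * Runs in hard-irq context on the busy CPU; "current" is the
	 * long-running task that was interrupted. Ask it to reschedule
	 * so that it reaches a point where the transition code can try
	 * to switch it again.
	 */
	set_tsk_need_resched(current);
	set_preempt_need_resched();
}

static DEFINE_PER_CPU(struct irq_work, klp_migrate_irq_work) =
	IRQ_WORK_INIT_HARD(klp_migrate_irq_work_fn);

/*
 * Could be called from klp_try_complete_transition() for each task that
 * is still running on a CPU and could not be switched.
 */
static void klp_kick_busy_task(struct task_struct *task)
{
	int cpu = task_cpu(task);

	if (task_curr(task))
		irq_work_queue_on(per_cpu_ptr(&klp_migrate_irq_work, cpu), cpu);
}

This does not address the reliable-stacktrace problem mentioned above; it
only gives the scheduler a chance to preempt the busy task so that
klp_try_switch_task() can be retried later.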