Re: [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched

Josh Poimboeuf <jpoimboe@xxxxxxxxxx> · Wed, 11 May 2022 21:07:54 -0700

On Wed, May 11, 2022 at 04:33:57PM +0000, Song Liu wrote:
> >> Ideally we'd have the ORC unwinder for all arches, that would make this
> >> much easier.  But we're not there yet.
> > 
> > The alternative solution is that the process has to migrate itself
> > at some safe location.
> > 
> > One crazy idea. It still might be possible to find the called
> > functions on the stack even when it is not reliable. Then it
> > might be possible to add another ftrace handler on
> > these found functions. This other ftrace handler might migrate
> > the task when it calls this function again.
> > 
> > It assumes that the task will call the same functions again
> > and again. Also it might require that the task checks its
> > own stack from the ftrace handler. I am not sure if this
> > is possible.
> > 
> > There might be other variants of this approach.
> 
> This might be the ultimate solution! As ftrace allows filtering based
> on pid (/sys/kernel/tracing/set_ftrace_pid), we can technically trigger
> klp_try_switch_task() on every function of the pending tasks. If this
> works, we should finish most of the transition in seconds. And the only
> failure there would be threads with a to-be-patched function at the very
> bottom of their stack. Am I too optimistic here?

It's a crazy idea, but I kind of like it ;-)  Especially this variant of
tracing all functions for the task.  We'd have to make sure unwinding
from an ftrace handler works for all arches/unwinders.
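
To make the "trace all functions for the task" variant concrete, here is a
toy user-space model in C. It is only a sketch under stated assumptions:
toy_task, toy_call(), toy_return() and toy_stack_is_safe() are made-up names,
not kernel APIs, and the stack check merely stands in for whatever a reliable
unwind from the ftrace handler would provide. The model checks only the
callers already on the stack at function entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 16

/* Toy task: a call stack of function ids plus its transition state. */
struct toy_task {
	int stack[MAX_DEPTH];
	size_t depth;
	bool switched;	/* true once the task has moved to the new patch state */
};

/*
 * Hypothetical stand-in for the stack check: the task is safe to switch
 * only if none of the to-be-patched functions is on its stack.
 */
static bool toy_stack_is_safe(const struct toy_task *t,
			      const int *patched, size_t n)
{
	for (size_t i = 0; i < t->depth; i++)
		for (size_t j = 0; j < n; j++)
			if (t->stack[i] == patched[j])
				return false;
	return true;
}

/*
 * Models a handler that fires on every function entry of a pending task:
 * first try to switch the task, then record the new frame.
 */
static void toy_call(struct toy_task *t, int func,
		     const int *patched, size_t n)
{
	if (!t->switched && toy_stack_is_safe(t, patched, n))
		t->switched = true;

	assert(t->depth < MAX_DEPTH);
	t->stack[t->depth++] = func;
}

/* Models a function return: drop the innermost frame. */
static void toy_return(struct toy_task *t)
{
	if (t->depth > 0)
		t->depth--;
}
```

In this model a task switches at the first function entry that happens while
no to-be-patched function is on its stack. A task whose outermost frame is a
to-be-patched function never switches, which is the one failure case
mentioned above.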

-- 
Josh