On Tue, Jun 06, 2017 at 01:07:40PM +0200, Miroslav Benes wrote: > On Tue, 23 May 2017, Petr Mladek wrote: > > > rcu_read_(un)lock(), list_*_rcu(), and synchronize_rcu() are used for > > a secure access and manipulation of the list of patches that modify > > the same function. In particular, it is the variable func_stack that > > is accessible from the ftrace handler via struct ftrace_ops and klp_ops. > > > > Of course, it synchronizes also some states of the patch on the top > > of the stack, e.g. func->transition in klp_ftrace_handler. > > > > At the same time, this mechanism guards also the manipulation > > of task->patch_state. It is modified according to the state of > > the transition and the state of the process. > > > > Now, all this works well as long as RCU works well. Sadly livepatching > > might get into some corner cases when this is not true. For example, > > RCU is not watching when rcu_read_lock() is taken in idle threads. > > It is because they might sleep and prevent reaching the grace period > > for too long. > > > > There are ways how to make RCU watching even in idle threads, > > see rcu_irq_enter(). But there is a small location inside RCU > > infrastructure when even this does not work. > > > > This small problematic location can be detected either before > > calling rcu_irq_enter() by rcu_irq_enter_disabled() or later by > > rcu_is_watching(). Sadly, there is no safe way how to handle it. > > Once we detect that RCU was not watching, we might see inconsistent > > state of the function stack and the related variables in > > klp_ftrace_handler(). Then we could do a wrong decision, > > use an incompatible implementation of the function and > > break the consistency of the system. We could warn but > > we could not avoid the damage. > > > > Fortunately, ftrace has similar problems and they seem to > > be solved well there. It uses a heavy weight implementation > > of some RCU operations. In particular, it replaces: > > > > + rcu_read_lock() with preempt_disable_notrace() > > + rcu_read_unlock() with preempt_enable_notrace() > > + synchronize_rcu() with schedule_on_each_cpu(sync_work) > > > > My understanding is that this is RCU implementation from > > a stone age. It meets the core RCU requirements but it is > > rather ineffective. Especially, it does not allow to batch > > or speed up the synchronize calls. > > > > On the other hand, it is very trivial. It allows to safely > > trace and/or livepatch even the RCU core infrastructure. > > And the effectiveness is a not a big issue because using ftrace > > or livepatches on productive systems is a rare operation. > > The safety is much more important than a negligible extra > > load. > > > > Note that the alternative implementation follows the RCU > > principles. Therefore, we could and actually must use > > list_*_rcu() variants when manipulating the func_stack. > > These functions allow to access the pointers in > > the right order and with the right barriers. But they > > do not use any other information that would be set > > only by rcu_read_lock(). > > > > Signed-off-by: Petr Mladek <pmladek@xxxxxxxx> > > I think this should work... if I understand the problem and how RCU works > correctly. > > > Second, ftrace needs to make sure that nobody is inside > > the dynamic trampoline when it is being freed. For this, it also > > calls synchronize_rcu_tasks() in preemptive kernel in > > ftrace_shutdown(). > > > > Livepatch has similar problem but it should get solved by ftrace > > for free. klp_ftrace_handler() is a good guy and newer sleeps. > > In addition, it is registered with FTRACE_OPS_FL_DYNAMIC. It causes > > that unregister_ftrace_function() calls: > > > > * schedule_on_each_cpu(ftrace_sync) - always > > * synchronize_rcu_tasks() - in preemptive kernel > > > > The effect is that nobody is neither inside the dynamic trampoline > > nor inside the ftrace handler. > > I would add something along these lines to the changelog. That we rely on > ftrace to do the right thing, because we are a good citizen. I agree, most of the text below the '---' marker should be in the changelog. Otherwise the patch looks good to me, other than the comment typos I mentioned earlier. -- Josh -- To unsubscribe from this list: send the line "unsubscribe live-patching" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html