On Wed, Feb 08, 2017 at 04:47:50PM +0100, Petr Mladek wrote: > > Notice in this case that klp_target_state is KLP_PATCHED. Which means > > that klp_complete_transition() would not call synchronize_rcu() at the > > right time, nor would it call module_put(). It can be fixed with: > > > > @@ -387,7 +389,7 @@ static int __klp_enable_patch(struct klp_patch *patch) > > pr_warn("failed to enable patch '%s'\n", > > patch->mod->name); > > > > - klp_unpatch_objects(patch); > > + klp_target_state = KLP_UNPATCHED; > > klp_complete_transition(); > > > > return ret; > > Great catch! Looks good to me. As it turns out, klp_target_state isn't accessible from this file, so I'll make a klp_cancel_transition() helper function which reverses klp_target_state and calls klp_complete_transition(). > > This assumes that the 'if (klp_target_state == KLP_UNPATCHED)' clause in > > klp_try_complete_transition() gets moved to klp_complete_transition() as > > you suggested. > > > > > > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c > > > > > > index 5efa262..1a77f05 100644 > > > > > > --- a/kernel/livepatch/patch.c > > > > > > +++ b/kernel/livepatch/patch.c > > > > > > @@ -29,6 +29,7 @@ > > > > > > #include <linux/bug.h> > > > > > > #include <linux/printk.h> > > > > > > #include "patch.h" > > > > > > +#include "transition.h" > > > > > > > > > > > > static LIST_HEAD(klp_ops); > > > > > > > > > > > > @@ -54,15 +55,58 @@ static void notrace klp_ftrace_handler(unsigned long ip, > > > > > > { > > > > > > struct klp_ops *ops; > > > > > > struct klp_func *func; > > > > > > + int patch_state; > > > > > > > > > > > > ops = container_of(fops, struct klp_ops, fops); > > > > > > > > > > > > rcu_read_lock(); > > > > > > + > > > > > > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func, > > > > > > stack_node); > > > > > > + > > > > > > + /* > > > > > > + * func should never be NULL because preemption should be disabled here > > > > > > + * and unregister_ftrace_function() does the equivalent of a > > > > > > + * synchronize_sched() before the func_stack removal. > > > > > > + */ > > > > > > + if (WARN_ON_ONCE(!func)) > > > > > > + goto unlock; > > > > > > + > > > > > > + /* > > > > > > + * Enforce the order of the ops->func_stack and func->transition reads. > > > > > > + * The corresponding write barrier is in __klp_enable_patch(). > > > > > > + */ > > > > > > + smp_rmb(); > > > > > > > > > > I was curious why the comment did not mention __klp_disable_patch(). > > > > > It was related to the hours of thinking. I would like to avoid this > > > > > in the future and add a comment like. > > > > > > > > > > * This barrier probably is not needed when the patch is being > > > > > * disabled. The patch is removed from the stack in > > > > > * klp_try_complete_transition() and there we need to call > > > > > * rcu_synchronize() to prevent seeing the patch on the stack > > > > > * at all. > > > > > * > > > > > * Well, it still might be needed to see func->transition > > > > > * when the patch is removed and the task is migrated. See > > > > > * the write barrier in __klp_disable_patch(). > > > > > > > > Agreed, though as you mentioned earlier, there's also the implicit > > > > barrier in klp_update_patch_state(), which would execute first in such a > > > > scenario. So I think I'll update the barrier comments in > > > > klp_update_patch_state(). > > > > > > You inspired me to a scenario with 3 CPUs: > > > > > > CPU0 CPU1 CPU2 > > > > > > __klp_disable_patch() > > > > > > klp_init_transition() > > > > > > func->transition = true > > > > > > smp_wmb() > > > > > > klp_start_transition() > > > > > > set TIF_PATCH_PATCHPENDING > > > > > > klp_update_patch_state() > > > > > > task->patch_state > > > = KLP_UNPATCHED > > > > > > smp_mb() > > > > > > klp_ftrace_handler() > > > func = list_... > > > > > > smp_rmb() /*needed?*/ > > > > > > if (func->transition) > > > > > > > I think this isn't possible. Remember the comment I added to > > klp_update_patch_state(): > > > > * NOTE: If task is not 'current', the caller must ensure the task is inactive. > > * Otherwise klp_ftrace_handler() might read the wrong 'patch_state' value. > > > > Right now klp_update_patch_state() is only called for current. > > klp_ftrace_handler() on CPU2 would be running in the context of a > > different task. > > I agree that it is impossible with the current code. In fact, I cannot > imagine a way to migrate the task where the barrier would be needed. > The question if we could/should somehow document it. Something like > > * The barrier is not really needed when the patch is being > * disabled. The value of func->transition would change > * the result of this handled only after the task is migrated. > * But the conditions for the migration are very limited > * and practically include a full barrier, see > * klp_update_patch_state(). Well, I'd like to avoid over-commenting this issue. For v5 I've added the following comment to klp_update_patch_state() -- see #2: /* * This test_and_clear_tsk_thread_flag() call also serves as a read * barrier (smp_rmb) for two cases: * * 1) Enforce the order of the TIF_PATCH_PENDING read and the * klp_target_state read. The corresponding write barrier is in * klp_init_transition(). * * 2) Enforce the order of the TIF_PATCH_PENDING read and a future read * of func->transition, if klp_ftrace_handler() is called later on * the same CPU. See __klp_disable_patch(). */ Is that sufficient? If not, I could maybe add another related comment in klp_ftrace_handler() which refers to this comment. -- Josh -- To unsubscribe from this list: send the line "unsubscribe live-patching" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html