On Wed 2016-05-04 10:51:21, Josh Poimboeuf wrote:
> On Wed, May 04, 2016 at 10:42:23AM +0200, Petr Mladek wrote:
> > On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > > Change livepatch to use a basic per-task consistency model.  This is
> > > the foundation which will eventually enable us to patch those ~10% of
> > > security patches which change function or data semantics.  This is
> > > the biggest remaining piece needed to make livepatch more generally
> > > useful.
> > >
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -76,6 +76,7 @@
> > >  #include <linux/compiler.h>
> > >  #include <linux/sysctl.h>
> > >  #include <linux/kcov.h>
> > > +#include <linux/livepatch.h>
> > >
> > >  #include <asm/pgtable.h>
> > >  #include <asm/pgalloc.h>
> > > @@ -1586,6 +1587,8 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> > >  		p->parent_exec_id = current->self_exec_id;
> > >  	}
> > >
> > > +	klp_copy_process(p);
> >
> > I have doubts here. We copy the state from the parent here. It means
> > that the new process might still need to be converted. But at the same
> > time, print_context_stack_reliable() returns zero without printing
> > any stack trace when the TIF_FORK flag is set. It means that a freshly
> > forked task might be converted immediately. It seems that both
> > operations are always done when copy_process() is called. But they are
> > contradicting each other.
> >
> > I guess that print_context_stack_reliable() should either return
> > -EINVAL when TIF_FORK is set, or it should try to print the stack of
> > the newly forked task.
> >
> > Or do I miss something, please?
>
> Ok, I admit it's confusing.
>
> A newly forked task doesn't *have* a stack (other than the pt_regs frame
> it needs for the return to user space), which is why
> print_context_stack_reliable() returns success with an empty array of
> addresses.
>
> For a little background, see the second switch_to() macro in
> arch/x86/include/asm/switch_to.h.  When a newly forked task runs for the
> first time, it returns from __switch_to() with no stack.  It then jumps
> straight to ret_from_fork in entry_64.S, calls a few C functions, and
> eventually returns to user space.  So, assuming we aren't patching entry
> code or the switch_to() macro in __schedule(), it should be safe to
> patch the task before it does all that.

This is a great explanation. Thanks for it.

> So, having said all that, I'm really not sure what the best approach is
> for print_context_stack_reliable().  Right now I'm thinking I'll change
> it back to return -EINVAL for a newly forked task, so it'll be more
> future-proof: better to have a false positive than a false negative.
> Either way it will probably need to be changed again if the
> ret_from_fork code gets cleaned up.

I would prefer the -EINVAL. It might save some hair when anyone is
working on patching the switch_to stuff. Also, it is not that big a loss
because most tasks will get migrated on the return to userspace.

It might help a bit with the newly forked kthreads. But there should be
a safer location where the new kthreads might get migrated, e.g. right
before the main function gets called.
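Just to make the -EINVAL variant concrete, something along these lines is
what I have in mind. It is only a rough sketch: the helper name and the
exact spot in your unwinder code are placeholders, not your actual patch.

#include <linux/errno.h>
#include <linux/sched.h>
#include <linux/thread_info.h>

/* placeholder name; the check would sit in print_context_stack_reliable() */
int klp_task_stack_reliable(struct task_struct *task)
{
	/*
	 * A freshly forked task has no stack frames to walk yet; it will
	 * jump to ret_from_fork when it runs for the first time.  Refusing
	 * to call its (empty) stack reliable keeps the task in the old
	 * patch state, and it can still be converted later, e.g. on its
	 * return to user space.
	 */
	if (test_ti_thread_flag(task_thread_info(task), TIF_FORK))
		return -EINVAL;

	/* ... the usual reliable frame walk would continue here ... */
	return 0;
}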
> > > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > > new file mode 100644
> > > index 0000000..92819bb
> > > --- /dev/null
> > > +++ b/kernel/livepatch/transition.c
> > > +/*
> > > + * This function can be called in the middle of an existing transition to
> > > + * reverse the direction of the target patch state.  This can be done to
> > > + * effectively cancel an existing enable or disable operation if there are any
> > > + * tasks which are stuck in the initial patch state.
> > > + */
> > > +void klp_reverse_transition(void)
> > > +{
> > > +	struct klp_patch *patch = klp_transition_patch;
> > > +
> > > +	klp_target_state = !klp_target_state;
> > > +
> > > +	/*
> > > +	 * Ensure that if another CPU goes through the syscall barrier, sees
> > > +	 * the TIF_PATCH_PENDING writes in klp_start_transition(), and calls
> > > +	 * klp_patch_task(), it also sees the above write to the target state.
> > > +	 * Otherwise it can put the task in the wrong universe.
> > > +	 */
> > > +	smp_wmb();
> > > +
> > > +	klp_start_transition();
> > > +	klp_try_complete_transition();
> >
> > It is a bit strange that we keep the work scheduled. It might be
> > better to use
> >
> >	mod_delayed_work(system_wq, &klp_work, 0);
>
> True, I think that would be better.
>
> > Which triggers more ideas from the nitpicking department:
> >
> > I would move the work definition from core.c to transition.c because
> > it is closely related to klp_try_complete_transition();
>
> That could be good, but there's a slight problem: klp_work_fn() requires
> klp_mutex, which is static to core.c.  It's kind of nice to keep the use
> of the mutex in core.c only.

I see and am surprised that we take the lock only in core.c ;-)

I do not have a strong opinion then. Just a small one. The lock also
guards operations in the other .c files. I think that it is only a
matter of time before we will need to access it there. And the work is
clearly transition-related.

But this is real nitpicking. I am sorry for it.

> > While we are at it, I would make it more clear that the work is
> > related to the transition.
>
> How would you recommend doing that?  How about:
>
> - rename "klp_work" -> "klp_transition_work"
> - rename "klp_work_fn" -> "klp_transition_work_fn"

Yup, sounds better (a rough sketch of what I mean is in the P.S. below).

> > Also I would call queue_delayed_work() directly
> > instead of adding the klp_schedule_work() wrapper. The delay
> > might be defined using a constant, e.g.
> >
> >	#define KLP_TRANSITION_DELAY	round_jiffies_relative(HZ)
> >
> >	queue_delayed_work(system_wq, &klp_transition_work, KLP_TRANSITION_DELAY);
>
> Sure.
>
> > Finally, the following is always called right after
> > klp_start_transition(), so I would call it from there.
> >
> >	if (!klp_try_complete_transition())
> >		klp_schedule_work();
>
> Except for when it's called by klp_reverse_transition().  And it really
> depends on whether we want to allow transition.c to use the mutex.  I
> don't have a strong opinion either way, I may need to think about it
> some more.

Ah, I had in mind that it could be replaced by

	mod_delayed_work(system_wq, &klp_transition_work, 0);

so that we would never call klp_try_complete_transition() directly.
Then it could be the same in all situations. But it might look strange
and be inefficient when really starting the transition. So, maybe
forget about it.

Best Regards,
Petr
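P.S. For illustration, the renamed work item in transition.c might look
roughly like this. It is only a sketch: the return type of
klp_try_complete_transition() and how the locking is handled are my
assumptions, not code from your series.

#include <linux/workqueue.h>
#include <linux/timer.h>
#include <linux/jiffies.h>

/* assumed declaration; the real helper lives elsewhere in transition.c */
bool klp_try_complete_transition(void);

#define KLP_TRANSITION_DELAY	round_jiffies_relative(HZ)

static void klp_transition_work_fn(struct work_struct *work);
static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);

/* periodically retry until all tasks reach the target patch state */
static void klp_transition_work_fn(struct work_struct *work)
{
	/* klp_mutex would be taken here, if it becomes visible outside core.c */
	if (!klp_try_complete_transition())
		queue_delayed_work(system_wq, &klp_transition_work,
				   KLP_TRANSITION_DELAY);
}

klp_reverse_transition() could then simply do
mod_delayed_work(system_wq, &klp_transition_work, 0) to force an
immediate re-check, as I mentioned above.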