On Wed, Feb 15, 2023 at 02:30:36PM +0100, Petr Mladek wrote: > > static inline int _cond_resched(void) > > { > > + klp_sched_try_switch(); > > return __cond_resched(); > > My only concern is if it might cause any performance problems. > > On one hand, cond_resched() is used in code paths that are slow > on its own. Also it will do nothing most of the time. > > On the other hand, cond_resched() is typically used in cycles. > One cycle might be fast. The code might be slow because there > are too many cycles. Repeating the same failing test might > prolong the time significantly. Yes, but it should hopefully be very rare to patch a function in the call stack of a kthread loop. In general it's a good idea for the patch author to avoid that. > An idea is to try the switch only when it was not done during > a real schedule. Something like: > > static inline int _cond_resched(void) > { > int scheduled; > > scheduled = __cond_resched(); > if (scheduled) > klp_sched_try_switch(); > > return scheduled(); > } > > But it would make it less reliable/predictable. Also it won't work > in configurations when cond_resched() is always a nop. > > I am probably too careful. We might keep it simple until any real > life problems are reported. If we can get away with it, I much prefer the simple unconditional klp_sched_try_switch() because of the predictability and quickness with which the kthread gets patched. > > --- a/kernel/livepatch/transition.c > > +++ b/kernel/livepatch/transition.c > > @@ -76,6 +96,8 @@ static void klp_complete_transition(void) > > klp_transition_patch->mod->name, > > klp_target_state == KLP_PATCHED ? "patching" : "unpatching"); > > > > + klp_cond_resched_disable(); > > + > > Nit: Strictly speaking, this is not needed when klp_complete_transition() > is called from klp_cancel_transition(). In this case, > klp_cond_resched_enable() was not called. So it might be moved into > klp_try_complete_transition(). Argh, I always forget about that pesky klp_cancel_transition(). > More important thing, thinking loudly: > > We need to make sure that no task is in the middle > klp_cond_resched_disable() when we modify anything that is used there. > > We seem to be on the safe side in klp_complete_transition(). We are > here only when all tasks have TIF_PATCH_PENDING cleared. In this case, > __klp_sched_try_switch() just returns. Also it calls > klp_synchronize_transition() so that all tasks finish the critical part > in __klp_sched_try_switch() before any new transition starts. > > But it is not the case in klp_reverse_transition(). It modifies > klp_target_state() when __klp_sched_try_switch might be in the middle > of klp_check_stack() and it might give wrong result. > > klp_reverse_transition() already solves similar race with > klp_update_patch_state() by clearing all TIF_PATCH_PENDING flags > and calling klp_synchronize_transition(). We just need to do > it earlier. Something like: Yes! Thanks, I can always count on you to find the race conditions ;-) This highlights the similarities between klp_target_state(current) and __klp_sched_try_switch(), they both access TIF_PATCH_PENDING out-of-band. Also, I'll update the comment in klp_copy_process(). It should be safe for with __klp_sched_try_switch() for the same reason as klp_update_patch_state(current): they all only work on 'current'. -- Josh