On Thu 2023-03-23 17:09:13, Petr Mladek wrote: > On Wed 2023-03-22 18:41:40, Alexey Dobriyan wrote: > > Hi, Josh. > > > > I've been profiling how much time livepatching takes and I have a question > > regarding these lines: > > > > void klp_try_complete_transition(void) > > { > > ... > > if (!complete) { > > schedule_delayed_work(&klp_transition_work, round_jiffies_relative(HZ)); > > return; > > } > > > > } > > > > The problem here is that if the system is idle, then the previous loop > > checking idle tasks will reliably sets "complete = false" and then > > patching wastes time waiting for next second so that klp_transition_work > > will repeat the same code without reentering itself. > > > > I've created trivial patch which changes 2 functions and it takes > > ~1.3 seconds to complete transition: > > > > [ 33.829506] livepatch: 'main': starting patching transition > > [ 35.190721] livepatch: 'main': patching complete > > > > I don't know what's the correct timeout to retry transition work > > but it can be made near-instant: > > > > [ 100.930758] livepatch: 'main': starting patching transition > > [ 100.956190] livepatch: 'main': patching complete > > > > > > Alexey (CloudLinux) > > > > > > --- a/kernel/livepatch/transition.c > > +++ b/kernel/livepatch/transition.c > > @@ -435,8 +435,7 @@ void klp_try_complete_transition(void) > > * later and/or wait for other methods like kernel exit > > * switching. > > */ > > - schedule_delayed_work(&klp_transition_work, > > - round_jiffies_relative(HZ)); > > + schedule_delayed_work(&klp_transition_work, msecs_to_jiffies(1)); > > Note that this affects all iterations, not only the first one. In > practice, it might schedule another klp_transition_work() almost > immediately. Servers typically use HZ = 250 and 1ms is less than 1 jiffy. > > Now, klp_try_complete_transition() takes tasklist_lock and blocks > forking and exiting. It sounds scary to do this so frequently. > Is am actually surprised that nobody reported any problems > even with the 1sec period. > > Please, stop these micro-optimizations! > > Livepatching is about security and reliability. It is not about speed! > > The fact that the transition is so lazy is a great feature. > Livepatching should make the system more safe without breaking it. > It is great that it could run in background and do not limit > the normal workload too much. By other words. It might be interesting to find a way how to migrate the idle tasks during the first klp_try_complete_transition() call. As a result the 2nd call would not be needed and it would make livepatching even more lightweight. But just increasing the frequency of klp_try_complete_transition() is not the way to go. Best Regards, Petr