On Wed 2021-07-07 14:49:41, Vasily Gorbik wrote:
> That's just a racy hack for now, for demonstration purposes.
>
> On an s390 system with a large number of cpus,
> klp_try_complete_transition() often cannot complete on the first
> attempt. klp_try_complete_transition() then reschedules itself as
> delayed work with a one-second delay. This adds up to a significant
> amount of time when there is a large number of livepatching transitions.
>
> This patch tries to minimize this delay by counting the processes which
> still need to be transitioned and then scheduling
> klp_try_complete_transition() right away.
>
> For an s390 LPAR with 128 cpus this reduces the livepatch kselftest
> run time from
>
> real    1m11.837s
> user    0m0.603s
> sys     0m10.940s
>
> to
>
> real    0m14.550s
> user    0m0.420s
> sys     0m5.779s
>
> And the qa_test_klp run time from
>
> real    5m15.950s
> user    0m34.447s
> sys     15m11.345s
>
> to
>
> real    3m51.987s
> user    0m27.074s
> sys     9m37.301s
>
> Would something like that be useful for production use cases?
> Any ideas on how to approach this more gracefully?

Honestly, I do not see a real-life use case for this, except maybe
speeding up a test suite. The livepatch transition is more about
reliability than about speed. In real life, a livepatch will be applied
only once in a while.

We have spent weeks thinking about and discussing the consistency model,
the code, and the barriers needed to handle the races correctly. In
particular, klp_update_patch_state() is a super-sensitive beast because
it is called without klp_lock. It might be pretty hard to synchronize it
with klp_reverse_transition() or klp_force_transition().

You would need to come up with a really convincing use case and numbers
to make it worth the effort.

Best Regards,
Petr
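
For reference, a minimal sketch of the retry-with-delay pattern the quoted
mail describes, in kernel-style C. The structure below is an illustration
based only on the behaviour described above (the transition work retrying
itself after roughly one second via schedule_delayed_work()); the function
bodies are placeholders and this is not the exact upstream
kernel/livepatch code:

#include <linux/jiffies.h>
#include <linux/mutex.h>
#include <linux/timer.h>
#include <linux/workqueue.h>

static DEFINE_MUTEX(klp_mutex);

static bool klp_try_complete_transition(void);

/* Work function: retake the lock and retry the transition. */
static void klp_transition_work_fn(struct work_struct *work)
{
	mutex_lock(&klp_mutex);
	klp_try_complete_transition();
	mutex_unlock(&klp_mutex);
}
static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);

static bool klp_try_complete_transition(void)
{
	bool complete = true;

	/* ... walk all tasks and try to switch them to the new state ... */

	if (!complete) {
		/*
		 * Some tasks could not be transitioned yet; retry in
		 * roughly one second.  With many CPUs and many
		 * transitions, this one-second granularity is what
		 * adds up to the long test run times quoted above.
		 */
		schedule_delayed_work(&klp_transition_work,
				      round_jiffies_relative(HZ));
		return false;
	}

	/* ... the transition is done, clean up ... */
	return true;
}

The quoted patch would, as described, replace the fixed one-second retry
with a count of the tasks that still need to switch, so the work can be
rescheduled as soon as that count indicates another attempt may succeed.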