On Mon 2025-01-27 14:35:26, Yafang Shao wrote: > The atomic replace livepatch mechanism was introduced to handle scenarios > where we want to unload a specific livepatch without unloading others. > However, its current implementation has significant shortcomings, making > it less than ideal in practice. Below are the key downsides: [...] > In the hybrid mode: > > - Specific livepatches can be marked as "non-replaceable" to ensure they > remain active and unaffected during replacements. > > - Other livepatches can be marked as "replaceable," allowing targeted > replacements of only those patches. > > This selective approach would reduce unnecessary transitions, lower the > risk of temporary patch loss, and mitigate performance issues during > livepatch replacement. > > --- a/kernel/livepatch/core.c > +++ b/kernel/livepatch/core.c > @@ -658,6 +658,8 @@ static int klp_add_nops(struct klp_patch *patch) > klp_for_each_object(old_patch, old_obj) { > int err; > > + if (!old_patch->replaceable) > + continue; This is one example where things might get very complicated. The same function might be livepatched by more livepatches, see ops->func_stack. For example, let's have funcA and three livepatches: a + lp1: .replace = false, .non-replace = true, .func = { .old_name = "funcA", .new_func = lp1_funcA, }, { } + lp2: .replace = false, .non-replace = false, .func = { .old_name = "funcA", .new_func = lp2_funcA, },{ .old_name = "funcB", .new_func = lp2_funcB, }, { } + lp3: .replace = true, .non-replace = false, .func = { .old_name = "funcB", .new_func = lp3_funcB, }, { } Now, apply lp1: + funcA() gets redirected to lp1_funcA() Then, apply lp2 + funcA() gets redirected to lp2_funcA() Finally, apply lp3: + The proposed code would add "nop()" for funcA() because it exists in lp2 and does not exist in lp3. + funcA() will get redirected to the original code because of the nop() during transition + nop() will get removed in klp_complete_transition() and funcA() will get suddenly redirected to lp1_funcA() because it will still be on ops->func_stack even after the "nop" and lp2_funcA() gets removed. => The entire system will start using another funcA implementation at some random time => this would violate the consistency model The proper solution might be tricky: 1. We would need to detect this situation and do _not_ add the "nop" for lp3 and funcA(). 2. We would need a more complicate code for handling the task states. klp_update_patch_state() sets task->patch_state using the global "klp_target_state". But in the above example, when enabling lp3: + funcA would need to get transitioned _backward_: KLP_TRANSITION_PATCHED -> KLP_TRANSITION_UNPATCHED , so that it goes on ops->func_stack: lp2_funcA -> lp1->funcA while: + funcA would need to get transitioned forward: KLP_TRANSITION_UNPATCHED -> KLP_TRANSITION_PATCHED , so that it goes on ops->func_stack: lp2_funcB -> lp3->funcB => the hybrid mode would complicate the life for both livepatch creators/maintainers and kernel code developers/maintainers. I am afraid that this complexity is not acceptable if there are better solutions for the original problem. > err = klp_add_object_nops(patch, old_obj); > if (err) > return err; I am sorry but I am quite strongly against this approach! Best Regards, Petr