On Mon, Jan 27, 2025 at 9:46 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> On Mon 2025-01-27 14:35:24, Yafang Shao wrote:
> > The atomic replace livepatch mechanism was introduced to handle scenarios
> > where we want to unload a specific livepatch without unloading others.
> > However, its current implementation has significant shortcomings, making
> > it less than ideal in practice. Below are the key downsides:
> >
> > - It is expensive
> >
> >   During testing with frequent replacements of an old livepatch, random RCU
> >   warnings were observed:
> >
> >   [19578271.779605] rcu_tasks_wait_gp: rcu_tasks grace period 642409 is 10024 jiffies old.
> >   [19578390.073790] rcu_tasks_wait_gp: rcu_tasks grace period 642417 is 10185 jiffies old.
> >   [19578423.034065] rcu_tasks_wait_gp: rcu_tasks grace period 642421 is 10150 jiffies old.
> >   [19578564.144591] rcu_tasks_wait_gp: rcu_tasks grace period 642449 is 10174 jiffies old.
> >   [19578601.064614] rcu_tasks_wait_gp: rcu_tasks grace period 642453 is 10168 jiffies old.
> >   [19578663.920123] rcu_tasks_wait_gp: rcu_tasks grace period 642469 is 10167 jiffies old.
> >   [19578872.990496] rcu_tasks_wait_gp: rcu_tasks grace period 642529 is 10215 jiffies old.
> >   [19578903.190292] rcu_tasks_wait_gp: rcu_tasks grace period 642529 is 40415 jiffies old.
> >   [19579017.965500] rcu_tasks_wait_gp: rcu_tasks grace period 642577 is 10174 jiffies old.
> >   [19579033.981425] rcu_tasks_wait_gp: rcu_tasks grace period 642581 is 10143 jiffies old.
> >   [19579153.092599] rcu_tasks_wait_gp: rcu_tasks grace period 642625 is 10188 jiffies old.
> >
> >   This indicates that atomic replacement can cause performance issues,
> >   particularly with RCU synchronization under frequent use.
>
> Please, provide more details about the test:
>
> + List of patched functions.

$ ls /sys/kernel/livepatch/livepatch_61_release12/vmlinux/
blk_throtl_cancel_bios,1  __d_move,1             do_iter_readv_writev,1
patched                   update_rq_clock,1      blk_mq_handle_expired,1
d_delete,1                do_exit,1              high_work_func,1
try_charge_memcg,1

$ ls /sys/kernel/livepatch/livepatch_61_release12/xfs/
patched  xfs_extent_busy_flush,1  xfs_iget,1  xfs_inode_mark_reclaimable,1

$ ls /sys/kernel/livepatch/livepatch_61_release12/fuse/
fuse_send_init,1  patched  process_init_reply,1

$ ls /sys/kernel/livepatch/livepatch_61_release12/nf_tables/
nft_data_init,1  patched

> + What exactly is meant by frequent replacements (busy loop? once a minute?)

The script:

#!/bin/bash
while true; do
	yum install -y ./kernel-livepatch-6.1.12-0.x86_64.rpm
	./apply_livepatch_61.sh		# it will sleep 5s
	yum erase -y kernel-livepatch-6.1.12-0.x86_64
	yum install -y ./kernel-livepatch-6.1.6-0.x86_64.rpm
	./apply_livepatch_61.sh		# it will sleep 5s
done

> + Size of the systems (number of CPUs, number of running processes)

top - 22:08:17 up 227 days,  3:29,  1 user,  load average: 50.66, 32.92, 20.77
Tasks: 1349 total,   2 running, 543 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.4 us,  0.9 sy,  0.0 ni, 88.1 id,  2.9 wa,  0.3 hi,  0.4 si,  0.0 st
KiB Mem : 39406304+total,  8899864 free, 43732568 used, 34143062+buff/cache
KiB Swap:        0 total,        0 free,        0 used. 32485065+avail Mem

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    2
Core(s) per socket:    64
Socket(s):             1
NUMA node(s):          1
Vendor ID:             AuthenticAMD
CPU family:            25
Model:                 1
Model name:            AMD EPYC 7Q83 64-Core Processor
Stepping:              1
CPU MHz:               3234.573
CPU max MHz:           3854.4431
CPU min MHz:           1500.0000
BogoMIPS:              4890.66
Virtualization:        AMD-V
L1d cache:             32K
L1i cache:             32K
L2 cache:              512K
L3 cache:              32768K
NUMA node0 CPU(s):     0-127

> + Were there any extra changes in the livepatch code,
>   e.g. debugging messages?

Not at all.

> > - Potential Risks During Replacement
> >
> >   One known issue involves replacing livepatched versions of critical
> >   functions such as do_exit(). During the replacement process, a panic
> >   might occur, as highlighted in [0].
>
> I would rather prefer to make the atomic replace safe. I mean to
> block the transition until all exiting processes are gone.
>
> Josh made a good point. We should look at how this is handled by
> RCU, tracing, or other subsystems which might have similar
> problems.

I'm not against this fix.

> > Other potential risks may also arise due to inconsistencies or race
> > conditions during transitions.
>
> What inconsistencies and race conditions do you have in mind, please?

I have explained it at
https://lore.kernel.org/live-patching/Z5DHQG4geRsuIflc@xxxxxxxxxxxxxxx/T/#m5058583fa64d95ef7ac9525a6a8af8ca865bf354

In klp_ftrace_handler():

	if (unlikely(func->transition)) {
		WARN_ON_ONCE(patch_state == KLP_UNDEFINED);
	}

Why is WARN_ON_ONCE() placed here? What issues have we encountered in
the past that led to the decision to add this warning?
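For reference, the surrounding logic, paraphrased from
klp_ftrace_handler() in kernel/livepatch/patch.c (details may differ
between kernel versions), also shows how ops->func_stack is used to
fall back to the previous patch:

	/* The task is mid-transition: pick the right version of the
	 * function for this task. */
	if (unlikely(func->transition)) {
		/*
		 * Order the func->transition read against the
		 * current->patch_state read; the corresponding write
		 * barrier is in klp_init_transition().
		 */
		smp_rmb();

		patch_state = current->patch_state;

		/* Once a transition has started, every task must have
		 * a defined state; KLP_UNDEFINED here would mean the
		 * handler raced with transition setup or teardown. */
		WARN_ON_ONCE(patch_state == KLP_UNDEFINED);

		if (patch_state == KLP_UNPATCHED) {
			/*
			 * Use the previously patched version of the
			 * function, or the original function if no
			 * previous patch exists on the stack.
			 */
			func = list_entry_rcu(func->stack_node.next,
					      struct klp_func, stack_node);

			if (&func->stack_node == &ops->func_stack)
				goto unlock;
		}
	}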
> > - Temporary Loss of Patching
> >
> >   During the replacement process, the old patch is set to a NOP (no-operation)
> >   before the new patch is fully applied. This creates a window where the
> >   function temporarily reverts to its original, unpatched state. If the old
> >   patch fixed a critical issue (e.g., one that prevented a system panic), the
> >   system could become vulnerable to that issue during the transition.
>
> This is not true!
>
> Please, look where klp_patch_object() and klp_unpatch_objects() are
> called. Also look at how ops->func_stack is handled in
> klp_ftrace_handler().
>
> Also you might want to read Documentation/livepatch/livepatch.rst

Thanks for your guidance.

> > The current atomic replacement approach replaces all old livepatches,
> > even when such a sweeping change is unnecessary. This can be improved
> > by introducing a hybrid mode, which allows the coexistence of both
> > atomic-replace and non-atomic-replace livepatches.
> >
> > In the hybrid mode:
> >
> > - Specific livepatches can be marked as "non-replaceable" to ensure they
> >   remain active and unaffected during replacements.
> >
> > - Other livepatches can be marked as "replaceable", allowing targeted
> >   replacements of only those patches.
>
> Honestly, I consider this as a workaround for a problem which should
> be fixed in a proper way.

This is not a workaround. We should avoid replacing functions that do
not differ between the two versions. Since automatic handling is not
possible for now, we can provide a sysfs interface to allow users to
perform it manually. A rough sketch of the idea is below.
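The field and attribute names here are hypothetical, only to
illustrate the idea, not a final implementation: an atomic-replace
patch would create NOPs only for old patches that are still marked
replaceable, e.g. in klp_add_nops():

	static int klp_add_nops(struct klp_patch *patch)
	{
		struct klp_patch *old_patch;
		struct klp_object *old_obj;

		klp_for_each_patch(old_patch) {
			/* Hypothetical flag: skip patches marked
			 * non-replaceable so no NOPs are created for
			 * them and they stay active across the
			 * atomic replace. */
			if (!old_patch->replaceable)
				continue;

			klp_for_each_object(old_patch, old_obj) {
				int err;

				err = klp_add_object_nops(patch, old_obj);
				if (err)
					return err;
			}
		}

		return 0;
	}

with the flag exposed as a per-patch sysfs attribute, for example:

	# hypothetical knob next to the existing enabled/transition files
	echo 0 > /sys/kernel/livepatch/<patch>/replaceable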
> The main advantage of the atomic replace is that it simplifies
> maintenance and debugging.

Is it worth the high overhead on production servers? Can you provide
examples of companies that use atomic replacement at scale in their
production environments?

> It reduces the amount of possible combinations. The
> hybrid mode brings back the jungle.

--
Regards
Yafang