> > <snip>
> > > Testing
> > > =======
> > >
> > > - I have run all of the livepatch selftests successfully. I have written a
> > >   couple of extra selftests myself which I will be posting separately
> > Hi,
> >
> > What test configuration/environment you are using for test?
> > When I tried kselftest with fedora based config on VM, I got errors
> > because livepatch transition won't finish until signal is sent
> > (i.e. it takes 15s for every transition).
> >
> > [excerpt from test result]
> > ```
> > $ sudo ./test-livepatch.sh
> > TEST: basic function patching ... not ok
> >
> > --- expected
> > +++ result
> > @@ -2,11 +2,13 @@
> >  livepatch: enabling patch 'test_klp_livepatch'
> >  livepatch: 'test_klp_livepatch': initializing patching transition
> >  livepatch: 'test_klp_livepatch': starting patching transition
> > +livepatch: signaling remaining tasks
> >  livepatch: 'test_klp_livepatch': completing patching transition
> > ```
>
> It might be interesting to see what process is blocking the
> transition. The transition state is visible in
> /proc/<pid>/patch_state.
>
> The transition is blocked when a process is in KLP_UNPATCHED state.
> It is defined in include/linux/livepatch.h:
>
>   #define KLP_UNPATCHED	 0
>
> Well, the timing against the transition is important. The following
> might help to see the blocking processes:
>
> $> modprobe livepatch-sample ; \
>    sleep 1; \
>    for proc_path in \
>        `grep "\-1" /proc/*/patch_state | cut -d '/' -f-3` ; \
>    do \
>        cat $proc_path/comm ; \
>        cat $proc_path/stack ; \
>        echo === ; \
>    done
>
> After this the livepatch has to be manualy disabled and removed
>
> $> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
> $> rmmod livepatch_sample

Thanks for the suggestion. This is quite helpful for debugging.

I did some tests and, in short, I could run all of the livepatch selftests
successfully on a clang15-built kernel when RANDOMIZE_KSTACK_OFFSET=n.

Below is my analysis. Please let me know if I'm wrong.
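
(Side note for anyone repeating the scan quoted above: here is a minimal Python sketch of the same idea. It is only an illustration, not part of the selftests. The helper names `read_text` and `tasks_in_state` are made up for this mail, and the state values are the ones I believe include/linux/livepatch.h defines at the time of writing; other kernel versions may differ. The state to match is a parameter so that either 0 or -1 can be tried. Reading /proc/<pid>/stack normally needs root.)

```python
import glob
import os

# Assumed values from include/linux/livepatch.h at the time of this thread;
# check the header of the kernel under test before relying on them.
KLP_UNDEFINED = -1
KLP_UNPATCHED = 0
KLP_PATCHED = 1


def read_text(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def tasks_in_state(state, proc_root="/proc"):
    """Return sorted (pid, comm, stack) tuples for processes whose
    patch_state file contains the given integer state."""
    found = []
    pattern = os.path.join(proc_root, "[0-9]*", "patch_state")
    for state_path in glob.glob(pattern):
        text = read_text(state_path)
        if text is None:
            continue  # task exited, or file not readable
        try:
            value = int(text)
        except ValueError:
            continue
        if value != state:
            continue
        task_dir = os.path.dirname(state_path)
        comm = read_text(os.path.join(task_dir, "comm")) or "?"
        stack = read_text(os.path.join(task_dir, "stack")) or "(stack not readable)"
        found.append((int(os.path.basename(task_dir)), comm, stack))
    return sorted(found)


if __name__ == "__main__":
    # Print every task still waiting in the chosen state, with its stack.
    for pid, comm, stack in tasks_in_state(KLP_UNPATCHED):
        print(f"{pid} {comm}\n{stack}\n===")
```

As with the shell version, timing matters: it has to run while the transition is still in progress, and the livepatch still has to be disabled and removed by hand afterwards.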

When I checked the task stacks while the live patch was in transition, I saw
that some tasks sleeping in a system call had not transitioned. For example,
I saw a task with the following stack:

```
sshd
[<0>] do_select+0x5cc/0x64c
[<0>] core_sys_select+0x174/0x210
[<0>] __arm64_sys_pselect6+0x11c/0x384
[<0>] invoke_syscall+0x78/0x108
[<0>] el0_svc_common+0xc0/0xfc
[<0>] do_el0_svc+0x38/0xd0
[<0>] el0_svc+0x34/0x110
[<0>] el0t_64_sync_handler+0x84/0xf0
[<0>] el0t_64_sync+0x190/0x194
```

Then I noticed that invoke_syscall generates instructions that add a random
offset to sp when RANDOMIZE_KSTACK_OFFSET=y, which is the case above.
Indeed, I can see in the binary that sp is modified:

```
$ objdump -d vmlinux --disassemble=invoke_syscall
...
ffff80000803076c <invoke_syscall>:
...
ffff8000080307b4:       9100011f        mov     sp, x8
...
ffff80000803085c:       d65f03c0        ret
```

This marks the instruction as UNRELIABLE, because the sp value is not
deterministic:
https://github.com/madvenka786/linux/blob/orc_v3/tools/objtool/arch/arm64/decode.c#L173

and in turn the generation of ORC data is skipped:
https://github.com/madvenka786/linux/blob/orc_v3/tools/objtool/dcheck.c#L313

I can confirm this from the ORC data in vmlinux:

```
$ ./tools/objtool/objtool --dump vmlinux
...
# no entry in range of invoke_syscall (ffff80000803076c - ffff80000803085c)
ffff800008030764: cfa:sp+0 x29:cfa+0 type:call end:0
ffff800008030874: cfa:(und) x29:(und) type:call end:0
ffff800008030874: cfa:sp+0 x29:cfa+0 type:call end:0
...
```

So, when a live patch is applied, the stack trace of any task that contains
invoke_syscall cannot be validated in arch_stack_walk_reliable(), and the
transition won't happen until the fake signal is delivered (unless the
task's state changes).

It seems that stack validation itself works as intended.
As I said, when RANDOMIZE_KSTACK_OFFSET=n, the selftests run fine.

Or have I misunderstood something completely?

Regards,
Tomohiro
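
P.S. In case it helps anyone repeat the ORC check, below is a minimal Python sketch that looks for dump entries whose address falls inside a function's address range. It assumes the dump lines look exactly like the excerpt in this mail (`<hex address>: cfa:...`); the function names `orc_addresses` and `entries_in_range` are made up for illustration. It only compares entry addresses against the range and does not model how objtool or the unwinder decides which entry covers a given pc.

```python
import re

# Matches dump lines of the form "<hex address>: cfa:..." (format assumed
# from the excerpt in this mail, not from objtool documentation).
ENTRY_RE = re.compile(r"^\s*([0-9a-fA-F]+):\s+cfa:")


def orc_addresses(dump_text):
    """Extract the address of every ORC entry line found in the dump text."""
    addrs = []
    for line in dump_text.splitlines():
        m = ENTRY_RE.match(line)
        if m:
            addrs.append(int(m.group(1), 16))
    return addrs


def entries_in_range(dump_text, start, end):
    """Return the entry addresses a with start <= a <= end."""
    return [a for a in orc_addresses(dump_text) if start <= a <= end]


# The three entries quoted in this mail.
SAMPLE = """\
ffff800008030764: cfa:sp+0 x29:cfa+0 type:call end:0
ffff800008030874: cfa:(und) x29:(und) type:call end:0
ffff800008030874: cfa:sp+0 x29:cfa+0 type:call end:0
"""

# Address range of invoke_syscall, taken from the objdump output above.
INVOKE_SYSCALL_START = 0xFFFF80000803076C
INVOKE_SYSCALL_END = 0xFFFF80000803085C

hits = entries_in_range(SAMPLE, INVOKE_SYSCALL_START, INVOKE_SYSCALL_END)
print("entries inside invoke_syscall:", [hex(a) for a in hits])
# prints: entries inside invoke_syscall: []
```

For a full check, the whole `--dump` output would be passed in place of `SAMPLE`, with the range taken from objdump or the symbol table.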