On Wed, May 04, 2022 at 03:07:53PM +0200, Petr Mladek wrote:
> On Tue 2022-05-03 12:49:34, Seth Forshee wrote:
> > A task can be livepatched only when it is sleeping or it exits to
> > userspace. This may happen infrequently for a heavily loaded vCPU task,
> > leading to livepatch transition failures.
>
> This is misleading.
>
> First, the problem is not a loaded CPU. The problem is that the
> task might spend very long time in the kernel when handling
> some syscall.

It's a fully loaded vCPU, which yes to the host looks like spending a
very long time in the ioctl(KVM_RUN) syscall. I can reword to clarify.

> Second, there is no timeout for the transition in the kernel code.
> It might take very long time but it will not fail.

I suppose the timeout is in kpatch then. I didn't check what implemented
the timeout. I'll remove the statement about timing out.

> > Fake signals will be sent to tasks which fail patching via stack
> > checking. This will cause running vCPU tasks to exit guest mode, but
> > since no signal is pending they return to guest execution without
> > exiting to userspace. Fix this by treating a pending livepatch migration
> > like a pending signal, exiting to userspace with EINTR. This allows the
> > task to be patched, and userspace should re-execute KVM_RUN to resume
> > guest execution.
>
> It seems that the patch works as expected but it is far from clear.
> And the above description helps only partially. Let me try to
> explain it for dummies like me ;-)
>
> <explanation>
> The problem was solved by sending a fake signal, see the commit
> 0b3d52790e1cfd6b80b826 ("livepatch: Remove signal sysfs attribute").
> It was achieved by calling signal_wake_up(). It set TIF_SIGPENDING
> and woke the task. It interrupted the syscall and the task was
> transitioned when leaving to the userspace.
>
> signal_wake_up() was later replaced by set_notify_signal(),
> see the commit 8df1947c71ee53c7e21 ("livepatch: Replace
> the fake signal sending with TIF_NOTIFY_SIGNAL infrastructure").
> The difference is that set_notify_signal() uses TIF_NOTIFY_SIGNAL
> instead of TIF_SIGPENDING.
>
> The effect is the same when running on a real hardware. The syscall
> gets interrupted and exit_to_user_mode_loop() is called where
> the livepatch state is updated (task migrated).
>
> But it works a different way in kvm where the task works are
> called in the guest mode and the task does not return into
> the user space in the host mode.
> </explanation>

Thanks, I can update the commit message to include more of this
background. (I've also sketched my understanding of that bare-metal
flow below, after the quoted commit message, for reference.)

> The solution provided by this patch is a bit weird, see below.
>
> > In my testing, systems where livepatching would timeout after 60 seconds
> > were able to load livepatches within a couple of seconds with this
> > change.
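
For reference, here is my rough mental model of the bare-metal flow Petr
describes in his <explanation> above. This is a hand-written sketch, not
verbatim kernel code; the *_sketch helper names are mine, the other
identifiers are the real ones from kernel/livepatch/transition.c and the
generic entry code.

#include <linux/sched.h>
#include <linux/sched/signal.h>
#include <linux/livepatch.h>
#include <linux/thread_info.h>

/* 1) The livepatch core "fake signals" tasks that still run old code. */
static void klp_fake_signal_sketch(struct task_struct *task)
{
	if (task->flags & PF_KTHREAD)
		wake_up_state(task, TASK_INTERRUPTIBLE);	/* kthreads: just wake them */
	else
		set_notify_signal(task);			/* sets TIF_NOTIFY_SIGNAL */
}

/*
 * 2) TIF_NOTIFY_SIGNAL interrupts the syscall; on the way back to
 *    userspace the generic entry code sees TIF_PATCH_PENDING and
 *    migrates the task to the new patch state.
 */
static void exit_to_user_mode_sketch(unsigned long ti_work)
{
	if (ti_work & _TIF_PATCH_PENDING)
		klp_update_patch_state(current);
}

So on bare metal the fake signal reliably gets the task through
exit_to_user_mode_loop(), whereas a vCPU task never takes that path and
just re-enters the guest.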
> >
> > Signed-off-by: Seth Forshee <sforshee@xxxxxxxxxxxxxxxx>
> > ---
> > Changes in v2:
> >  - Added _TIF_SIGPENDING to XFER_TO_GUEST_MODE_WORK
> >  - Reworded commit message and comments to avoid confusion around the
> >    term "migrate"
> >
> >  include/linux/entry-kvm.h | 4 ++--
> >  kernel/entry/kvm.c        | 7 ++++++-
> >  2 files changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/entry-kvm.h b/include/linux/entry-kvm.h
> > index 6813171afccb..bf79e4cbb5a2 100644
> > --- a/include/linux/entry-kvm.h
> > +++ b/include/linux/entry-kvm.h
> > @@ -17,8 +17,8 @@
> >  #endif
> >
> >  #define XFER_TO_GUEST_MODE_WORK					\
> > -	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL |	\
> > -	 _TIF_NOTIFY_RESUME | ARCH_XFER_TO_GUEST_MODE_WORK)
> > +	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | _TIF_PATCH_PENDING |	\
> > +	 _TIF_NOTIFY_SIGNAL | _TIF_NOTIFY_RESUME | ARCH_XFER_TO_GUEST_MODE_WORK)
> >
> >  struct kvm_vcpu;
> >
> > diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
> > index 9d09f489b60e..98439dfaa1a0 100644
> > --- a/kernel/entry/kvm.c
> > +++ b/kernel/entry/kvm.c
> > @@ -14,7 +14,12 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
> >  			task_work_run();
> >  		}
> >
> > -		if (ti_work & _TIF_SIGPENDING) {
> > +		/*
> > +		 * When a livepatch is pending, force an exit to userspace
> > +		 * as though a signal is pending to allow the task to be
> > +		 * patched.
> > +		 */
> > +		if (ti_work & (_TIF_SIGPENDING | _TIF_PATCH_PENDING)) {
> >  			kvm_handle_signal_exit(vcpu);
> >  			return -EINTR;
> >  		}
>
> This looks strange:
>
>   + klp_send_signals() calls set_notify_signal(task) that sets
>     TIF_NOTIFY_SIGNAL
>
>   + xfer_to_guest_mode_work() handles TIF_NOTIFY_SIGNAL by calling
>     task_work_run().
>
>   + This patch calls kvm_handle_signal_exit(vcpu) when
>     _TIF_PATCH_PENDING is set. It probably causes the guest
>     to call exit_to_user_mode_loop() because TIF_PATCH_PENDING
>     bit is set. But neither TIF_SIGPENDING nor TIF_NOTIFY_SIGNAL
>     is set so that it works different way than on the real hardware.
>
>
> Question:
>
> Does xfer_to_guest_mode_work() interrupt the syscall running
> on the guest?

xfer_to_guest_mode_work() is called as part of a loop to execute kvm
guests (for example, on x86 see vcpu_run() in arch/x86/kvm/x86.c). When
guest execution is interrupted (in the livepatch case it is interrupted
when set_notify_signal() is called for the vCPU task),
xfer_to_guest_mode_work() is called if there is pending work, and if it
returns non-zero the loop does not immediately re-enter guest execution
but instead returns to userspace. (A rough sketch of this loop is at the
end of this mail.)

> If "yes" then we do not need to call kvm_handle_signal_exit(vcpu).
> It will be enough to call:
>
> 	if (ti_work & _TIF_PATCH_PENDING)
> 		klp_update_patch_state(current);

What if the task's call stack contains a function being patched?

> If "no" then I do not understand why TIF_NOTIFY_SIGNAL interrupts
> the syscall on the real hardware and not in kvm.

It does interrupt, but xfer_to_guest_mode_handle_work() concludes it's
not necessary to return to userspace and resumes guest execution.

Thanks,
Seth

> Anyway, we either should make sure that TIF_NOTIFY_SIGNAL has the same
> effect on the real hardware and in kvm. Or we need another interface
> for the fake signal used by livepatching.
>
> Adding Jens Axboe and Eric into Cc.
>
> Best Regards,
> Petr
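
P.S. To make the KVM_RUN loop I mentioned above a bit more concrete, here
is a rough sketch. It is simplified from vcpu_run() in arch/x86/kvm/x86.c
and is not verbatim kernel code (blocking, SRCU handling, and error paths
are omitted; the vcpu_run_sketch name and the local declaration of
vcpu_enter_guest() are only here to keep the sketch self-contained):

#include <linux/entry-kvm.h>
#include <linux/kvm_host.h>

/* Actually static in arch/x86/kvm/x86.c; declared here for the sketch. */
int vcpu_enter_guest(struct kvm_vcpu *vcpu);

/* Sketch of the host-side ioctl(KVM_RUN) loop (simplified). */
static int vcpu_run_sketch(struct kvm_vcpu *vcpu)
{
	int r = 1;

	while (r > 0) {
		/* Arch code enters the guest and returns on a VM exit. */
		r = vcpu_enter_guest(vcpu);
		if (r <= 0)
			break;

		if (xfer_to_guest_mode_work_pending()) {
			/*
			 * With this patch, a pending livepatch makes this
			 * return -EINTR just like a pending signal would,
			 * so we return to userspace instead of re-entering
			 * the guest, and the task is migrated on the way
			 * out through exit_to_user_mode_loop().
			 */
			r = xfer_to_guest_mode_handle_work(vcpu);
			if (r)
				return r;
		}
	}

	return r;
}

Without the _TIF_PATCH_PENDING check, the fake signal only sets
TIF_NOTIFY_SIGNAL, which (as you note above) xfer_to_guest_mode_work()
handles via task_work_run(), so the loop goes straight back into the
guest and the task never reaches exit_to_user_mode_loop().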