On Wed, May 8, 2024 at 1:16 PM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote: > > On Tue, May 07, 2024 at 10:03:59PM +0800, Yafang Shao wrote: > > On Tue, May 7, 2024 at 10:35 AM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote: > > > User space needs to be polling for the transition to complete so it can > > > reverse the patch if it stalls. Otherwise the patch could stall forever > > > and go unnoticed. > > > > > > Can't user space just unload the replaced module after it detects the > > > completed transition? > > > > Are you referring to polling the > > "/sys/kernel/livepatch/XXX/transition"? The challenge lies in the > > uncertainty regarding which livepatches will be replaced and how many. > > Even if we can poll the transition status, there's no guarantee that a > > livepatch will be replaced by this operation. > > If klp_patch.replace is set on the new patch then it will replace all > previous patches. A scenario exists wherein a user could simultaneously disable a loaded livepatch, potentially resulting in it not being replaced by the new patch. While theoretical, this possibility is not entirely implausible. > > If the replaced patches remain in /sys/livepatch then you can simply > unload all the patches with enabled == 0, after the new patch succeeds > (enabled == 1 && transition == 0). > > > > I'm not sure I see the benefit in complicating the kernel and possibly > > > introducing bugs, when unloading the module from user space seems to be > > > a perfectly valid option. > > > > > > Also, an error returned by delete_module() to the kernel would be > > > ignored and the module might remain in memory forever without being > > > noticed. > > > > As Petr pointed out, we can enhance the functionality by checking the > > return value and providing informative error messages. This aligns > > with the user experience when deleting a module; if deletion fails, > > users have the option to try again. Similarly, if error messages are > > displayed, users can manually remove the module if needed. > > Calling delete_module() from the kernel means there's no syscall with > which to return an error back to the user. pr_error() can calso tell the user the error, right ? If we must return an error to the user, probably we can use klp_free_replaced_patches_sync() instead of klp_free_replaced_patches_async(). Under what conditions might a kernel module of a disabled livepatch be unable to be removed? -- Regards Yafang