On Mon, Jun 24, 2024 at 10:04 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> On Sun 2024-06-23 10:52:40, Yafang Shao wrote:
> > On Fri, Jun 21, 2024 at 6:04 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
> > > On Fri 2024-06-21 16:39:40, Yafang Shao wrote:
> > > > On Fri, Jun 21, 2024 at 3:31 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
> > > > > On Tue 2024-06-11 10:46:47, Yafang Shao wrote:
> > > > > > On Tue, Jun 11, 2024 at 1:19 AM Song Liu <song@xxxxxxxxxx> wrote:
> > > > > > > On Sun, Jun 9, 2024 at 6:33 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > > >
> > > > They are not random livepatches. Some system administrators maintain
> > > > their own livepatches tailored for specific workloads or develop
> > > > livepatches for A/B testing, while our company's kernel team maintains
> > > > the general livepatches. This worked well in the past, when all
> > > > livepatches were independent. However, the situation changed when we
> > > > switched from non-atomic-replace to atomic-replace livepatches, causing
> > > > issues due to the uncertain behavior of mixing atomic-replace and
> > > > non-atomic-replace livepatches.
> > >
> > > I think that the uncertain behavior existed even before you started
> > > using the atomic replace.
> > >
> > > How did the system administrators detect whether the livepatches were
> > > independent?
> > >
> > > It looks to me that it worked only by luck. Well, I could imagine
> > > that it has worked for a long time.
> >
> > We have a limited number of livepatches in our system, primarily
> > intended to address critical issues that could potentially cause
> > system instability. Therefore, the likelihood of conflicts between two
> > livepatches is relatively low. While I acknowledge that there is
> > indeed a potential risk involved, it is the atomic-replace livepatch
> > that poses this risk.
>
> I see.
>
> > > > To address this change, we need a robust solution.
> > > > One approach we have identified is developing a CI build system for
> > > > livepatches. All livepatches must be built through this CI system,
> > > > where they undergo CI tests to verify whether they are atomic
> > > > replace or not.
> > >
> > > The important part is that the livepatch authors have to see
> > > all already existing changes. They need to check that
> > > the new change would not break other livepatched code, and
> > > vice versa.
> >
> > Exactly. Through this CI system, all developers have visibility into
> > all the livepatches currently running on our system.
>
> Sounds good.
>
> > > > Additionally, in our production environment, we need to ensure that
> > > > no non-atomic-replace livepatches are in use. For instance, some
> > > > system administrators might still build livepatches outside of our
> > > > CI system. Detecting whether a single livepatch is atomic replace
> > > > or not is not easy. To simplify this check, we propose adding a new
> > > > sysfs attribute.
> > > >
> > > > BTW, perhaps we could introduce a new sysctl setting to forcefully
> > > > forbid all non-atomic-replace livepatches?
> > >
> > > I like it. This looks like the most reliable solution. Would it
> > > solve your problem?
> > >
> > > An alternative solution would be to forbid installing non-replace
> > > livepatches when a livepatch with atomic replace is already
> > > installed. I could imagine that we enforce this for everyone
> > > (without a sysctl knob). Would this work for you?
> >
> > Perhaps we can add this sysctl knob as follows?
> >
> > kernel.livepatch.forbid_non_atomic_replace:
> >   0 - Allow non-atomic-replace livepatches. (Default behavior)
> >   1 - Completely forbid non-atomic-replace livepatches.
> >   2 - Forbid non-atomic-replace livepatches only if an atomic-replace
> >       livepatch is already running.
>
> Feel free to send a patch if it would be useful for you.

I will do it in a separate patch.

--
Regards
Yafang