Re: [PATCH v2 0/3] livepatch: Add "replace" sysfs attribute

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun 2024-06-23 10:52:40, Yafang Shao wrote:
> On Fri, Jun 21, 2024 at 6:04 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
> > On Fri 2024-06-21 16:39:40, Yafang Shao wrote:
> > > On Fri, Jun 21, 2024 at 3:31 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
> > > > On Tue 2024-06-11 10:46:47, Yafang Shao wrote:
> > > > > On Tue, Jun 11, 2024 at 1:19 AM Song Liu <song@xxxxxxxxxx> wrote:
> > > > > > On Sun, Jun 9, 2024 at 6:33 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > > They are not random live patches. Some system administrators maintain
> > > their own livepatches tailored for specific workloads or develop
> > > livepatches for A/B testing. Our company’s kernel team maintains the
> > > general livepatches. This worked well in the past when all livepatches
> > > were independent. However, the situation changed when we switched from
> > > non atomic replace livepatches to atomic replace livepatches, causing
> > > issues due to the uncertain behavior of mixed atomic replace and non
> > > atomic replace livepatches.
> >
> > I think that the uncertain behavior has been even before you started
> > using the atomic replace.
> >
> > How did the system administrators detect whether the livepatches were
> > independent?
> >
> > It looks to me that it worked only by luck. Well, I could imagine
> > that it has worked for a long time.
> 
> We have a limited number of livepatches in our system, primarily
> intended to address critical issues that could potentially cause
> system instability. Therefore, the likelihood of conflicts between two
> livepatches is relatively low. While I acknowledge that there is
> indeed a potential risk involved, it is the atomic replace livepatch
> that poses this risk.

I see.

> >
> >
> > > To address this change, we need a robust solution. One approach we
> > > have identified is developing a CI build system for livepatches. All
> > > livepatches must be built through this CI system, where they will
> > > undergo CI tests to verify if they are atomic replace or not.
> >
> > The important part is that the livepatch authors have to see
> > all already existing changes. They need to check that
> > the new change would not break other livepatched code and
> > vice versa.
> 
> Exactly. Through this CI system, all developers have visibility into
> all the livepatches currently running on our system.

Sounds good.

> > > Additionally, in our production environment, we need to ensure that
> > > there are no non atomic replace livepatches in use. For instance, some
> > > system administrators might still build livepatches outside of our CI
> > > system. Detecting whether a single livepatch is atomic replace or not
> > > is not easy. To simplify this, we propose adding a new sysfs attribute
> > > to facilitate this check.
> > >
> > > BTW, perhaps we could introduce a new sysctl setting to forcefully
> > > forbid all non atomic replace livepatches?
> >
> > I like it. This looks like the most reliable solution. Would it
> > solve your problem.
> >
> > Alternative solution would be to forbid installing non-replace
> > livepatches when there is already installed a livepatch with
> > the atomic replace. I could imagine that we enforce this for
> > everyone (without sysctl knob). Would this work for you?
> 
> Perhaps we can add this sysctl knob as follows?
> 
> kernel.livepatch.forbid_non_atomic_replace:
>     0 - Allow non atomic replace livepatch. (Default behavior)
>     1 - Completely forbid non atomic replace livepatch.
>     2 - Forbid non atomic replace livepatch only if there is already
> an atomic replace livepatch running.

Feel free to send a patch if it would be useful for you.

> > That said, I am not completely against adding the sysfs attribute
> > "replace". I could imagine that it might be useful when a more
> > flexible solution is needed. But I think that it would be
> > hard to make a more flexible solution safe, especially
> > in your environment.
> 
> This sysfs attribute remains valuable as it aids in monitoring
> livepatches, specifically allowing us to track which type of livepatch
> is currently operating on a server.

Fair enough. I am fine with it. I would only like to improve the
commit message explaining the motivation, see my reply to the patch.

Best Regards,
Petr




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux Kernel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux