Re: [PATCH v4 00/14] kexec: introduce Kexec HandOver (KHO)

Hi Mike,

On Thu, Feb 6, 2025 at 5:28 AM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
> From: "Mike Rapoport (Microsoft)" <rppt@xxxxxxxxxx>
>
> Hi,
>
> This a next version of Alex's "kexec: Allow preservation of ftrace buffers"
> series (https://lore.kernel.org/all/20240117144704.602-1-graf@xxxxxxxxxx),
> just to make things simpler instead of ftrace we decided to preserve
> "reserve_mem" regions.
>
> The patches are also available in git:
> https://git.kernel.org/rppt/h/kho/v4
>
>
> Kexec today considers itself purely a boot loader: When we enter the new
> kernel, any state the previous kernel left behind is irrelevant and the
> new kernel reinitializes the system.
>
> However, there are use cases where this mode of operation is not what we
> actually want. In virtualization hosts for example, we want to use kexec
> to update the host kernel while virtual machine memory stays untouched.
> When we add device assignment to the mix, we also need to ensure that
> IOMMU and VFIO states are untouched. If we add PCIe peer to peer DMA, we
> need to do the same for the PCI subsystem. If we want to kexec while an
> SEV-SNP enabled virtual machine is running, we need to preserve the VM
> context pages and physical memory. See "pkernfs: Persisting guest memory
> and kernel/device state safely across kexec" Linux Plumbers
> Conference 2023 presentation for details:
>
>   https://lpc.events/event/17/contributions/1485/
>
> To start us on the journey to support all the use cases above, this patch
> implements basic infrastructure to allow hand over of kernel state across
> kexec (Kexec HandOver, aka KHO). As a really simple example target, we use
> memblock's reserve_mem.
> With this patch set applied, memory that was reserved using "reserve_mem"
> command line options remains intact after kexec and it is guaranteed to
> reside at the same physical address.

Nice work!

One concern is that reserving memory through memblock, the way
crashkernel= does, is not flexible. I worked on kdump years ago, and one
of its biggest pains was deciding how much memory to reserve with
crashkernel=. It is still a pain today.

If we reserve more, the 1st kernel wastes more memory. If we reserve
less, the 2nd kernel is more likely to hit OOM.

I'd suggest considering CMA, where the "reserved" memory can still be
reused for other purposes: pages are simply migrated out of the reserved
region on demand, that is, when loading a kexec kernel. Of course, we need
to make sure those pages are not reused by what you want to preserve here,
e.g., IOMMU. So it might need additional work, but I still believe this is
the right direction.
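
To make the trade-off concrete, here is a toy model in Python. It is only
a sketch and not kernel code: Host, usable_static(), usable_cma() and
claim_cma() are hypothetical names made up for illustration, and the
"migration" is reduced to a page count. It assumes that only movable pages
ever borrow the region and that they can be migrated out only if enough
free pages exist outside it.

```python
from dataclasses import dataclass


@dataclass
class Host:
    total_pages: int     # all RAM visible to the 1st kernel
    reserved_pages: int  # size of the region set aside for the 2nd kernel


def usable_static(host: Host) -> int:
    """Static reservation: the region is lost to the 1st kernel."""
    return host.total_pages - host.reserved_pages


def usable_cma(host: Host) -> int:
    """CMA-style reservation: movable pages may borrow the region."""
    return host.total_pages


def claim_cma(host: Host, borrowed_pages: int, free_outside: int) -> int:
    """Claim the region on demand (e.g. when loading a kexec kernel).

    Every borrowed page must be migrated to a free page outside the
    region. Returns the number of pages migrated.
    """
    if borrowed_pages > host.reserved_pages:
        raise ValueError("cannot borrow more pages than the region holds")
    if borrowed_pages > free_outside:
        raise MemoryError("not enough free pages outside the region")
    return borrowed_pages
```

With 1000 pages in total and a 256-page region, the static scheme leaves
744 pages to the 1st kernel while the CMA-style scheme leaves all 1000.
The cost moves to claim time, where the migration can fail if memory
outside the region is already full.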

Just my two cents.

Thanks!




