On Fri, Feb 28 2025, Mike Rapoport wrote:

> Hi Pratyush,
>
> On Wed, Feb 26, 2025 at 08:08:27PM +0000, Pratyush Yadav wrote:
>> Hi Mike,
>>
>> On Thu, Feb 06 2025, Mike Rapoport wrote:
>>
>> > From: "Mike Rapoport (Microsoft)" <rppt@xxxxxxxxxx>
>> >
>> > Hi,
>> >
>> > This is the next version of Alex's "kexec: Allow preservation of ftrace buffers"
>> > series (https://lore.kernel.org/all/20240117144704.602-1-graf@xxxxxxxxxx),
>> > just to make things simpler, instead of ftrace we decided to preserve
>> > "reserve_mem" regions.
>> [...]
>>
>> I applied the patches on top of v6.14-rc1 and tried them out on an x86
>> qemu machine. When I do a plain KHO activate and kexec, I get the below
>> errors on boot. This causes networking to fail on the VM. The errors are
>> consistent and happen on every kexec-reboot, though fairly late in boot,
>> after systemd tries to bring up the network. The same setup has worked
>> fine with Alex's v3 of the KHO patches.
>>
>> Do you see anything obvious that might cause this? I can try to debug
>> this tomorrow, but if it rings any loud bells it would be nice to know.
>
> Thanks for the report!
> It didn't ring any bells, but I've found the issue and a fast-and-dirty
> fix.
>
> The scratch areas are allocated from high addresses and there is no scratch
> memory to satisfy memblock_alloc_low() in swiotlb, so the second kernel
> produces a couple of
>
>	software IO TLB: swiotlb_memblock_alloc: Failed to allocate 67108864 bytes for tlb structure

I also did some digging today and ended up finding the same thing, but
it seems you got there before me :-)

>
> and without those buffers e1000 can't do DMA :(
>
> A quick fix would be to add another scratch area in the lower memory
> (below). I'll work on a better fix.

I have already written a less-quick fix (patch pasted below), so I
suppose we can use that to review the idea instead. It adds a dedicated
scratch area for lowmem, similar to your patch, and adds some tracking
to calculate the size.

I am not sure if the size estimation is completely right though, since
it is possible that allocations that don't _need_ to be in lowmem end
up being there, causing the scratch area to be too big (or perhaps even
causing allocation failures if the scale is big enough). Maybe we would
be better off tracking lowmem allocation requests separately?

----- 8< -----
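[Editor's note: the actual patch that followed the scissors line is not
reproduced here. As a rough illustration of the idea being discussed (a
dedicated KHO scratch region below 4G so that memblock_alloc_low()
callers such as swiotlb in the second kernel have scratch memory to draw
from), a minimal sketch follows. The struct, size constant, and function
name are hypothetical placeholders, not the real KHO code; only the
memblock calls are existing kernel API.]

#include <linux/memblock.h>
#include <linux/sizes.h>
#include <linux/mm.h>

/* Hypothetical descriptor for one scratch region. */
struct kho_scratch_region {
	phys_addr_t addr;
	phys_addr_t size;
};

/* Hypothetical fixed size for the dedicated lowmem scratch area. */
#define KHO_SCRATCH_LOWMEM_SIZE	SZ_256M

static struct kho_scratch_region kho_scratch_lowmem __initdata;

static int __init kho_reserve_scratch_lowmem(void)
{
	phys_addr_t addr;

	/*
	 * Allocate the scratch area from memory below 4G so that
	 * memblock_alloc_low() in the next kernel (swiotlb being the
	 * prominent user) can be satisfied from scratch memory.
	 */
	addr = memblock_phys_alloc_range(KHO_SCRATCH_LOWMEM_SIZE,
					 PAGE_SIZE, 0, SZ_4G);
	if (!addr)
		return -ENOMEM;

	kho_scratch_lowmem.addr = addr;
	kho_scratch_lowmem.size = KHO_SCRATCH_LOWMEM_SIZE;

	return 0;
}

The open question raised above would apply to how KHO_SCRATCH_LOWMEM_SIZE
is chosen: sizing it from all observed memblock allocations risks
over-counting allocations that merely happened to land in lowmem, which
is why tracking lowmem allocation requests separately might give a
tighter estimate.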