Re: [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use

James Morse <james.morse@xxxxxxx> · Fri, 27 Mar 2020 15:40:24 +0000

Hi Baoquan,

On 3/27/20 2:11 AM, Baoquan He wrote:
> On 03/26/20 at 06:07pm, James Morse wrote:
>> arm64 recently queued support for memory hotremove, which led to some
>> new corner cases for kexec.
>>
>> If the kexec segments are loaded for a removable region, that region may
>> be removed before kexec actually occurs. This causes the first kernel to
>> lockup when applying the relocations. (I've triggered this on x86 too).

> Do you mean you use 'kexec -l /boot/vmlinuz-xxxx --initrd ...' to load a
> kernel, next you hot remove some memory regions, then you execute
> 'kexec -e' to trigger kexec reboot?

Yes. But to make it more fun, get someone else to trigger the hot-remove behind
your back!

> I may not get the point clearly, but we usually do the loading and
> triggering of kexec-ed kernel at the same time. 

But its two syscalls. Should the second one fail if the memory layout has
changed since the first?

(UEFI does this for exit-boot-services, there is handshake to prove you know
what the current memory map is)

>> The first patch adds a memory notifier for kexec so that it can refuse
>> to allow in-use regions to be taken offline.
>>
>>
>> This doesn't solve the problem for arm64, where the new kernel must
>> initially rely on the data structures from the first boot to describe
>> memory. These don't describe hotpluggable memory.
>> If kexec places the kernel in one of these regions, it must also provide
>> a DT that describes the region in which the kernel was mapped as memory.
>> (and somehow ensure its always present in the future...)
>>
>> To prevent this from happening accidentally with unaware user-space,
>> patches two and three allow arm64 to give these regions a different
>> name.
>>
>> This is a change in behaviour for arm64 as memory hotadd and hotremove
>> were added separately.
>>
>>
>> I haven't tried kdump.
>> Unaware kdump from user-space probably won't describe the hotplug
>> regions if the name is different, which saves us from problems if
>> the memory is no longer present at kdump time, but means the vmcore
>> is incomplete.

> Currently, we will monitor udev events of mem hot add/remove, then
> reload kdump kernel. That reloading is only update the elfcorehdr,
> because crashkernel has to be reserved during 1st kernel bootup. I don't
> think this will have problem.

Great. I don't think there is much the kernel can do for the kdump case, so its
good to know the tools already exist for detecting and restarting the kdump load
when the memory layout changes.

For kdump via kexec-file-load, we would need to regenerate the elfcorehdr, I'm
hoping that can be done in core code.

Thanks,

James