Re: [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Anshuman,

On 3/27/20 12:43 AM, Anshuman Khandual wrote:
> On 03/26/2020 11:37 PM, James Morse wrote:
>> An image loaded for kexec is not stored in place, instead its segments
>> are scattered through memory, and are re-assembled when needed. In the
>> meantime, the target memory may have been removed.
>>
>> Because mm is not aware that this memory is still in use, it allows it
>> to be removed.

> Why the isolation process does not fail when these pages are currently
> being used by kexec ?

Kexec isn't using them right now, but it will once kexec is triggered.
Those physical addresses are held in some internal kexec data structures until
kexec is triggered.


>> Add a memory notifier to prevent the removal of memory regions that
>> overlap with a loaded kexec image segment. e.g., when triggered from the
>> Qemu console:
>> | kexec_core: memory region in use
>> | memory memory32: Offline failed.
> 
> Yes this is definitely an added protection for these kexec loaded kernels
> memory areas from being offlined but I would have expected the preceding
> offlining to have failed as well.

kexec hasn't allocate the memory, part of the regions user-space may specify for
the next kernel may be in use. There is nothing to stop the memory being used in
the meantime.


Any other way of doing this would prevent us saying why it failed.
Like this, the user can spot the 'kexec: Memory region in use message', and
unload kexec.


>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>> index c19c0dad1ebe..ba1d91e868ca 100644
>> --- a/kernel/kexec_core.c
>> +++ b/kernel/kexec_core.c
>> @@ -1219,3 +1222,56 @@ void __weak arch_kexec_protect_crashkres(void)
>>  
>>  void __weak arch_kexec_unprotect_crashkres(void)
>>  {}
>> +
>> +/*
>> + * If user-space wants to offline memory that is in use by a loaded kexec
>> + * image, it should unload the image first.
>> + */

> Probably this would need kexec user manual and related system call man pages
> update as well.

I can't see anything relevant under Documentation. (kdump yes, kexec no...)


>> +static int mem_remove_cb(struct notifier_block *nb, unsigned long action,
>> +			 void *data)
>> +{
>> +	int rv = NOTIFY_OK, i;
>> +	struct memory_notify *arg = data;
>> +	unsigned long pfn = arg->start_pfn;
>> +	unsigned long nr_segments, sstart, send;
>> +	unsigned long end_pfn = arg->start_pfn + arg->nr_pages;
>> +
>> +	might_sleep();
> 
> Required ?

Habit, and I think best practice. We take a mutex, so might_sleep(), but we also
conditionally return before lockdep would see the mutex. Having this annotation
means a dangerous change to the way this is called triggers a warning without
having to test memory hotplug explicitly.


>> +
>> +	if (action != MEM_GOING_OFFLINE)
>> +		return NOTIFY_DONE;
>> +
>> +	mutex_lock(&kexec_mutex);
>> +	if (kexec_image) {
>> +		nr_segments = kexec_image->nr_segments;
>> +
>> +		for (i = 0; i < nr_segments; i++) {
>> +			sstart = PFN_DOWN(kexec_image->segment[i].mem);
>> +			send = PFN_UP(kexec_image->segment[i].mem +
>> +				      kexec_image->segment[i].memsz);
>> +
>> +			if ((pfn <= sstart && sstart < end_pfn) ||
>> +			    (pfn <= send && send < end_pfn)) {
>> +				pr_warn("Memory region in use\n");
>> +				rv = NOTIFY_BAD;
>> +				break;
>> +			}
>> +		}
>> +	}
>> +	mutex_unlock(&kexec_mutex);
>> +
>> +	return rv;
> 
> Variable 'rv' is redundant, should use NOTIFY_[BAD|OK] directly instead.

You'd prefer a mutex_unlock() in the middle of the loop? ... or goto?
(I'm not convinced)

>> +}
>> +
>> +static struct notifier_block mem_remove_nb = {
>> +	.notifier_call = mem_remove_cb,
>> +};
>> +
>> +static int __init register_mem_remove_cb(void)
>> +{
>> +	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG))
> 
> Should not all these new code here be wrapped with CONFIG_MEMORY_HOTREMOVE
> to reduce the scope as well as final code size when the config is disabled.

The compiler is really good at this. "if (false)" means this is all dead code,
and static means its not exported, so the compiler is free to remove it.

Not #ifdef-ing it makes it much more readable, and means the compiler checks its
valid C before it removes it. This avoids weird header include problems that
only show up on some rand-config builds.


Thanks,

James

>> +		return register_memory_notifier(&mem_remove_nb);
>> +
>> +	return 0;
>> +}
>> +device_initcall(register_mem_remove_cb);
>>
> 





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux