Re: [PATCH -V3] x86/tdx, memory hotplug: Check whole hot-adding memory range for TDX

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, Oscar,

Oscar Salvador <osalvador@xxxxxxx> writes:

> On Tue, Oct 22, 2024 at 11:16:17AM +0800, Huang Ying wrote:
>> On systems with TDX (Trust Domain eXtensions) enabled, current kernel
>> checks the TDX compatibility of the hot-added memory ranges through a
>> memory hotplug notifier for each memory_block.  If a memory range
>> which isn't TDX compatible is hot-added, for example, some CXL memory,
>> the command line as follows,
>> 
>>   $ echo 1 > /sys/devices/system/node/nodeX/memoryY/online
>> 
>> will report something like,
>> 
>>   bash: echo: write error: Operation not permitted
>> 
>> If pr_debug() is enabled, current kernel will show the error message
>> like below in the kernel log,
>> 
>>   online_pages [mem 0xXXXXXXXXXX-0xXXXXXXXXXX] failed
>> 
>> Both are too general to root cause the problem.  This may confuse
>> users.  One solution is to print some error messages in the TDX memory
>> hotplug notifier.  However, kernel calls memory hotplug notifiers for
>> each memory block, so this may lead to a large volume of messages in
>> the kernel log if a large number of memory blocks are onlined with a
>> script or automatically.  For example, the typical size of memory
>> block is 128MB on x86_64, when online 64GB CXL memory, 512 messages
>> will be logged.
>> 
>> Therefore, this patch checks the TDX compatibility of the whole
>> hot-adding memory range through a newly added architecture specific
>> function (arch_check_hotplug_memory_range()).  If this patch rejects
>> the memory hot-adding for TDX compatibility, it will output a kernel
>> log message like below,
>> 
>>   virt/tdx: Reject hot-adding memory range: 0xXXXXXXXX-0xXXXXXXXX for TDX compatibility.
>> 
>> The target use case is to support CXL memory on TDX enabled systems.
>> If the CXL memory isn't compatible with TDX, the kernel will reject
>> the whole CXL memory range.  While the CXL memory can still be used
>> via devdax interface.
>> 
>> This also makes the original TDX memory hotplug notifier useless, so
>> this patch deletes it.
>> 
>> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
>
> Acked-by: Oscar Salvador <osalvador@xxxxxxx>

Thanks!

> One question below:
>
> ...
>
>> +int tdx_check_hotplug_memory_range(u64 start, u64 size)
>>  {
>> -	struct memory_notify *mn = v;
>> -
>> -	if (action != MEM_GOING_ONLINE)
>> -		return NOTIFY_OK;
>> +	u64 start_pfn = PHYS_PFN(start);
>> +	u64 end_pfn = PHYS_PFN(start + size);
>>  
>>  	/*
>>  	 * Empty list means TDX isn't enabled.  Allow any memory
>> -	 * to go online.
>> +	 * to be hot-added.
>>  	 */
>>  	if (list_empty(&tdx_memlist))
>> -		return NOTIFY_OK;
>> +		return 0;
>>  
>>  	/*
>>  	 * The TDX memory configuration is static and can not be
>> -	 * changed.  Reject onlining any memory which is outside of
>> +	 * changed.  Reject hot-adding any memory which is outside of
>>  	 * the static configuration whether it supports TDX or not.
>>  	 */
>> -	if (is_tdx_memory(mn->start_pfn, mn->start_pfn + mn->nr_pages))
>> -		return NOTIFY_OK;
>> +	if (is_tdx_memory(start_pfn, end_pfn))
>> +		return 0;
>>  
>> -	return NOTIFY_BAD;
>> +	pr_info("Reject hot-adding memory range: %#llx-%#llx for TDX compatibility.\n",
>> +		start, start + size);
>
> Why not using pr_err() here?
>
> I was checking which kind of information level we use when failing at
> hot-adding memory, and we seem to be using pr_err(), and pr_debug() when
> onlining/offlining.
>
> Not a big deal, and not saying it is wrong, but was just wondering the reasoning
> behind.

TBH, I have no strong opinion about which log level is more appropriate.
IMHO, it shouldn't be pr_debug() to make it easy for users to root cause
the hot-adding failure.  And, it appears too harsh to use pr_err(),
because there's no program error, etc.  So, I think that something
in-between is more appropriate.  That is, pr_warn(), pr_notice, or
pr_info().  In them, I prefer pr_info() a little.

--
Best Regards,
Huang, Ying




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux