Re: [Question] Memory hotplug clarification for Qemu ARM/virt

On 05/08/19 14:50, Robin Murphy wrote:
> Hi Shameer,
> 
> On 08/05/2019 11:15, Shameerali Kolothum Thodi wrote:
>> Hi,
>>
>> The series here[0] attempts to add PCDIMM support to QEMU for the
>> ARM/virt platform. It has stumbled upon an issue: it is not clear (at
>> least from the QEMU/EDK2 point of view) how hotpluggable memory is
>> handled by the kernel in the physical world.
>>
>> The proposed implementation in QEMU builds the SRAT and DSDT parts
>> and uses a GED device to trigger the hotplug. This works fine.
>>
>> But when we added the DT node corresponding to the PCDIMM (cold plug
>> scenario), we noticed that the guest kernel sees this memory during
>> early boot even if we are booting with ACPI. Because of this, the
>> hotpluggable memory may end up in ZONE_NORMAL, which makes it
>> impossible to hot-unplug even if the guest boots with ACPI.
>>
>> Further discussions[1] revealed that EDK2 UEFI has no means to
>> interpret the ACPI content from QEMU (this is by design) and uses the
>> DT info to build the GetMemoryMap() output. To solve this, we
>> introduced a "hotpluggable" property for the DT memory node (patches
>> #7 & #8 from [0]) so that UEFI can differentiate the nodes and exclude
>> the hotpluggable ones from GetMemoryMap().
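
To make the filtering described above concrete, here is a minimal
sketch in Python. It is only an illustration under assumptions: the
MemoryNode type, its field names, and memory_map_ranges() are
hypothetical and do not correspond to actual EDK2 or QEMU code. It
shows the effect of patches #7 and #8 as described: every node marked
"hotpluggable" is left out of the map, whether or not it is present.

```python
from typing import List, NamedTuple, Tuple


class MemoryNode(NamedTuple):
    """Hypothetical view of one DT memory node (not an EDK2 structure)."""
    base: int
    size: int
    hotpluggable: bool  # True if the node carries the "hotpluggable" property


def memory_map_ranges(nodes: List[MemoryNode]) -> List[Tuple[int, int]]:
    """Return (base, size) pairs the firmware would report via GetMemoryMap().

    Hotpluggable nodes are skipped unconditionally, because the property
    alone does not say whether the memory is present at boot.
    """
    return [(n.base, n.size) for n in nodes if not n.hotpluggable]


# Example addresses are made up for illustration.
nodes = [
    MemoryNode(0x4000_0000, 0x4000_0000, hotpluggable=False),  # boot RAM
    MemoryNode(0x8000_0000, 0x4000_0000, hotpluggable=True),   # cold-plugged PCDIMM
]

for base, size in memory_map_ranges(nodes):
    print(f"base={base:#x} size={size:#x}")
```
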
>>
>> But then Laszlo rightly pointed out that in order to accommodate the
>> changes in UEFI we need to know exactly how Linux expects/handles all
>> the hotpluggable memory scenarios. Please find the discussion here[2].
>>
>> For ease, I am just copying the relevant comment from Laszlo below,
>>
>> /******
>> "Given patches #7 and #8, as I understand them, the firmware cannot
>> distinguish hotpluggable & present, from hotpluggable & absent. The
>> firmware can only skip both hotpluggable cases. That's fine in that
>> the firmware will hog neither type -- but is that OK for the OS as
>> well, for both ACPI boot and DT boot?
>>
>> Consider in particular the "hotpluggable & present, ACPI boot" case.
>> Assuming we modify the firmware to skip "hotpluggable" altogether, the
>> UEFI memmap will not include the range despite it being present at
>> boot. Presumably, ACPI will refer to the range somehow, however. Will
>> that not confuse the OS?
>>
>> When Igor raised this earlier, I suggested that hotpluggable-and-present
>> should be added by the firmware, but also allocated immediately, as
>> EfiBootServicesData type memory. This will prevent other drivers in the
>> firmware from allocating AcpiNVS or Reserved chunks from the same memory
>> range, the UEFI memmap will contain the range as EfiBootServicesData,
>> and then the OS can release that allocation in one go early during boot.
>>
>> But this really has to be clarified from the Linux kernel's
>> expectations. Please formalize all of the following cases:
>>
>> OS boot (DT/ACPI)  hotpluggable & ...  GetMemoryMap() should report as  DT/ACPI should report as
>> -----------------  ------------------  -------------------------------  ------------------------
>> DT                 present             ?                                ?
>> DT                 absent              ?                                ?
>> ACPI               present             ?                                ?
>> ACPI               absent              ?                                ?
>>
>> Again, this table is dictated by Linux."
>>
>> ******/
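
For reference, the EfiBootServicesData suggestion quoted above can be
sketched as a small classification function. This is a sketch under
stated assumptions, not a proposal for the open table entries: the
"present" flag is hypothetical (patches #7 and #8 do not expose it),
and the MemoryNode type and efi_memory_type() are made-up names. Only
the memory type names EfiConventionalMemory and EfiBootServicesData
come from the UEFI specification.

```python
from typing import NamedTuple, Optional


class MemoryNode(NamedTuple):
    """Hypothetical view of one DT memory node (not an EDK2 structure)."""
    base: int
    size: int
    hotpluggable: bool
    present: bool  # assumption: the firmware could learn this somehow


def efi_memory_type(node: MemoryNode) -> Optional[str]:
    """Pick the UEFI memory type for a node, or None to omit it from the map."""
    if not node.hotpluggable:
        # Ordinary RAM: free for the firmware and later the OS to use.
        return "EfiConventionalMemory"
    if node.present:
        # Added to the map but allocated right away, so that no firmware
        # driver can carve AcpiNVS or Reserved chunks out of the range.
        return "EfiBootServicesData"
    # Hotpluggable and absent: nothing to report.
    return None


# Example addresses are made up for illustration.
for node in (
    MemoryNode(0x4000_0000, 0x4000_0000, hotpluggable=False, present=True),
    MemoryNode(0x8000_0000, 0x4000_0000, hotpluggable=True, present=True),
    MemoryNode(0xC000_0000, 0x4000_0000, hotpluggable=True, present=False),
):
    print(f"{node.base:#x}: {efi_memory_type(node)}")
```

Whether the OS copes with each of these outcomes is exactly what the
table asks.
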
>>
>> Could you please take a look at this and let us know what is expected
>> here from a Linux kernel viewpoint?
> 
> For arm64, so far we've not even been considering DT-based hotplug - as
> far as I'm aware there would still be a big open question there around
> notification mechanisms and how to describe them. The DT stuff so far
> has come from the PowerPC folks, so it's probably worth seeing what
> their ideas are.
> 
> ACPI-wise I've always assumed/hoped that hotplug-related things should
> be sufficiently well-specified in UEFI that "do whatever x86/IA-64 do"
> would be enough for us.

As far as I can see in UEFI v2.8 -- and I had checked the spec before
dumping the table with the many question marks on Shameer --, all the
hot-plug language in the spec refers to USB and PCI hot-plug in the
preboot environment. There is not a single word about hot-plug at OS
runtime (regarding any device or component type), nor about memory
hot-plug (at any time).

Looking to x86 appears valid -- so what does the Linux kernel expect on
that architecture, in the "ACPI" rows of the table?

Shameer: if you (Huawei) are represented on the USWG / ASWG, I suggest
re-raising the question on those lists too; at least the "ACPI" rows of
the table.

Thanks!
Laszlo

> 
> Robin.
> 
>> (Hi Laszlo/Igor/Eric, please feel free to add/change if I have missed
>> any valid points above).
>>
>> Thanks,
>> Shameer
>> [0] https://patchwork.kernel.org/cover/10890919/
>> [1] https://patchwork.kernel.org/patch/10863299/
>> [2] https://patchwork.kernel.org/patch/10890937/
>>
>>



