On Fri, Apr 1, 2022 at 1:40 PM Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
>
> On Thu, 31 Mar 2022, Vishal Verma wrote:
>
> >The CXL specification does not define any additional constraints on
> >the hotplug flow beyond PCIe native hotplug, so a kernel that supports
> >native PCIe hotplug, supports CXL hotplug.
>
> Hmm but from a Linux-pov does it make sense to allow hotplug
> support if the MM cannot handle it?

I would say yes, i.e. do not consider CONFIG_MEMORY_HOTPLUG for
OSC_CXL_NATIVE_HP_SUPPORT, but see below and poke holes in my
argument...

>
> @@ -531,7 +518,8 @@ static u32 calculate_cxl_support(void)
>  	support = OSC_CXL_2_0_PORT_DEV_REG_ACCESS_SUPPORT;
>  	if (pci_aer_available())
>  		support |= OSC_CXL_PROTOCOL_ERR_REPORTING_SUPPORT;
> -	if (IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
> +	if (IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE) &&
> +	    IS_ENABLED(CONFIG_MEMORY_HOTPLUG))
>  		support |= OSC_CXL_NATIVE_HP_SUPPORT;
>
>  	return support;
>
> After all, per the CXL 2.0 Type 3 device Hot-Add flow:
>
> ""
> 7. CXL aware software notifies OS memory manager about the new memory and its
> attributes such as latency and bandwidth. Memory manager processes a request
> and adds the new memory to its allocation pool.
> ""

If I look at ACPI hotplug it is true that CONFIG_ACPI_HOTPLUG_MEMORY
depends on CONFIG_MEMORY_HOTPLUG. However, it is also true that there
is no existing _OSC for memory hotplug support. The reason is that
ACPI memory hotplug requires the OS to acknowledge / coordinate with
memory plug events via a scan handler. On the CXL side the equivalent
would be if Linux supported the Mechanical Retention Lock [1], or
otherwise had some coordination for the driver of a PCI device
undergoing hotplug to be consulted on whether the hotplug should
proceed or not.

The concern is that if Linux says no to supporting CXL hotplug then
the BIOS may say no to giving the OS hotplug control of any other PCIe
device.
So the question here is not whether hotplug is enabled, it's whether
it is handled natively by the OS at all, and if
CONFIG_HOTPLUG_PCI_PCIE is enabled then the answer is "yes".

Otherwise, the plan for CXL coordinated remove, since the kernel does
not support blocking hotplug, is to require the memory device to be
disabled before hotplug is attempted. When CONFIG_MEMORY_HOTPLUG is
disabled that step will fail, and the user is expected to cancel the
remove attempt. If that is not honored and the card is removed anyway
then it does not matter whether CONFIG_MEMORY_HOTPLUG is enabled or
not, it will cause a crash and other badness.

Long story short, just say yes to CXL hotplug and require removal to
be coordinated by userspace unless and until the kernel grows better
mechanisms for doing "managed" removal of devices in consultation with
the driver.

[1]: https://lore.kernel.org/all/20201122014203.4706-1-ashok.raj@xxxxxxxxx/
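
To make the position argued above concrete, here is a minimal
standalone sketch of the support calculation with the
CONFIG_MEMORY_HOTPLUG check left out. It is not the kernel code: the
struct cxl_osc_config type and the sketch_* functions are hypothetical
stand-ins for pci_aer_available() and IS_ENABLED(), and the bit values
are placeholders chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration; not taken from kernel headers. */
#define OSC_CXL_2_0_PORT_DEV_REG_ACCESS_SUPPORT 0x2u
#define OSC_CXL_PROTOCOL_ERR_REPORTING_SUPPORT  0x4u
#define OSC_CXL_NATIVE_HP_SUPPORT               0x8u

/* Hypothetical model of the build-time and runtime inputs. */
struct cxl_osc_config {
	bool aer_available;    /* models pci_aer_available() */
	bool hotplug_pci_pcie; /* models IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE) */
	bool memory_hotplug;   /* models IS_ENABLED(CONFIG_MEMORY_HOTPLUG) */
};

/* Compute the support mask; memory_hotplug is deliberately not consulted. */
uint32_t sketch_cxl_support(const struct cxl_osc_config *cfg)
{
	uint32_t support = OSC_CXL_2_0_PORT_DEV_REG_ACCESS_SUPPORT;

	if (cfg->aer_available)
		support |= OSC_CXL_PROTOCOL_ERR_REPORTING_SUPPORT;
	/* Native PCIe hotplug alone decides the CXL hotplug bit. */
	if (cfg->hotplug_pci_pcie)
		support |= OSC_CXL_NATIVE_HP_SUPPORT;

	return support;
}

/* Self-check: toggling memory_hotplug must not change the result. */
void sketch_self_check(void)
{
	struct cxl_osc_config cfg = { true, true, false };
	uint32_t without_mm = sketch_cxl_support(&cfg);

	cfg.memory_hotplug = true;
	assert(sketch_cxl_support(&cfg) == without_mm);
	assert(without_mm & OSC_CXL_NATIVE_HP_SUPPORT);
}
```

In this model the hotplug bit follows CONFIG_HOTPLUG_PCI_PCIE only, so
a kernel without memory hotplug still answers "yes" and does not give
the BIOS a reason to withhold hotplug control of other PCIe devices.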