On Tue, 2022-07-26 at 17:50 -0700, Dave Hansen wrote:
> On 7/26/22 17:34, Kai Huang wrote:
> > > This doesn't seem right to me.  *If* we get a known-bogus
> > > hot-remove event, we need to reject it.  Remember, removal is a
> > > two-step process.
> > If so, we need to reject the (CMR) memory offline.  Or we just BUG()
> > in the ACPI memory removal callback?
> >
> > But either way this will require us to get the CMRs during kernel boot.
>
> I don't get the link there between CMRs at boot and handling hotplug.
>
> We don't need to go to extreme measures just to get a message out of the
> kernel that the BIOS is bad.  If we don't have the data to do it
> already, then I don't really see the need to warn about it.
>
> Think of a system that has TDX enabled in the BIOS, but is running an
> old kernel.  It will have *ZERO* idea that hotplug doesn't work.  It'll
> run blissfully along.  I don't see any reason that a kernel with TDX
> support, but where TDX is disabled should actively go out and try to be
> better than those old pre-TDX kernels.

Agreed, assuming that by "where TDX is disabled" you mean TDX isn't usable
(i.e. when the TDX module isn't loaded, or won't be initialized at all).

>
> Further, there's nothing to stop non-CMR memory from being added to a
> system with TDX enabled in the BIOS but where the kernel is not using
> it.  If we actively go out and keep good old DRAM from being added, then
> we unnecessarily addle those systems.
>

OK.

Then for memory hot-add, perhaps we can just go with the "winner-take-all"
approach you mentioned before?

For memory hot-removal, as I replied previously, it looks like the kernel
cannot reject the removal once it has allowed the memory to go offline.  Any
suggestions on this?

--
Thanks,
-Kai