Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> writes:

> On Tue, 09 Jan 2024 11:41:11 +0800
> "Huang, Ying" <ying.huang@xxxxxxxxx> wrote:
>
>> Gregory Price <gregory.price@xxxxxxxxxxxx> writes:
>>
>> > On Thu, Jan 04, 2024 at 02:05:01PM +0800, Huang, Ying wrote:
>> >> >
>> >> > From https://lpc.events/event/16/contributions/1209/attachments/1042/1995/Live%20In%20a%20World%20With%20Multiple%20Memory%20Types.pdf
>> >> > abstract_distance_offset: override by users to deal with firmware issue.
>> >> >
>> >> > say firmware can configure the cxl node into wrong tiers, similar to
>> >> > that it may also configure all cxl nodes into single memtype, hence
>> >> > all these nodes can fall into a single wrong tier.
>> >> > In this case, per node adistance_offset would be good to have ?
>> >>
>> >> I think that it's better to fix the error firmware if possible. And
>> >> these are only theoretical, not practical issues. Do you have some
>> >> practical issues?
>> >>
>> >> I understand that users may want to move nodes between memory tiers for
>> >> different policy choices. For that, memory_type based adistance_offset
>> >> should be good.
>> >>
>> >
>> > There's actually an affirmative case to change memory tiering to allow
>> > either movement of nodes between tiers, or at least base placement on
>> > HMAT information. Preferably, membership would be changable to allow
>> > hotplug/DCD to be managed (there's no guarantee that the memory passed
>> > through will always be what HMAT says on initial boot).
>>
>> IIUC, from Jonathan Cameron as below, the performance of memory
>> shouldn't change even for DCD devices.
>>
>> https://lore.kernel.org/linux-mm/20231103141636.000007e4@xxxxxxxxxx/
>>
>> It's possible to change the performance of a NUMA node changed, if we
>> hot-remove a memory device, then hot-add another different memory
>> device. It's hoped that the CDAT changes too.
>
> Not supported, but ACPI has _HMA methods to in theory allow changing
> HMAT values based on firmware notifications... So we 'could' make
> it work for HMAT based description.
>
> Ultimately my current thinking is we'll end up emulating CXL type3
> devices (hiding topology complexity) and you can update CDAT but
> IIRC that is only meant to be for degraded situations - so if you
> want multiple performance regions, CDAT should describe them form the start.

Thank you very much for your input!

So, to support degraded performance, we will need to move a NUMA node
between memory tiers. And, per my understanding, we should do that in
the kernel.

>>
>> So, all in all, HMAT + CDAT can help us to put the memory device in
>> appropriate memory tiers. Now, we have HMAT support in upstream. We
>> will working on CDAT support.
>>

--
Best Regards,
Huang, Ying