Thank you, Mike Rapoport, for joining the discussion and sharing your thoughts.

>Hi,
>
>On Thu, Mar 23, 2023 at 07:51:05PM +0900, Kyungsan Kim wrote:
>> I appreciate Dan for the careful advice.
>>
>> >Kyungsan Kim wrote:
>> >[..]
>> >> >In addition to CXL memory, we may have other kind of memory in the
>> >> >system, for example, HBM (High Bandwidth Memory), memory in FPGA card,
>> >> >memory in GPU card, etc. I guess that we need to consider them
>> >> >together. Do we need to add one zone type for each kind of memory?
>> >>
>> >> We also don't think a new zone is needed for every single memory
>> >> device. Our viewpoint is the sole ZONE_NORMAL becomes not enough to
>> >> manage multiple volatile memory devices due to the increased device
>> >> types. Including CXL DRAM, we think the ZONE_EXMEM can be used to
>> >> represent extended volatile memories that have different HW
>> >> characteristics.
>> >
>> >Some advice for the LSF/MM discussion, the rationale will need to be
>> >more than "we think the ZONE_EXMEM can be used to represent extended
>> >volatile memories that have different HW characteristics". It needs to
>> >be along the lines of "yes, to date Linux has been able to describe DDR
>> >with NUMA effects, PMEM with high write overhead, and HBM with improved
>> >bandwidth not necessarily latency, all without adding a new ZONE, but a
>> >new ZONE is absolutely required now to enable use case FOO, or address
>> >unfixable NUMA problem BAR." Without FOO and BAR to discuss the code
>> >maintainability concern of "fewer degrees of freedom in the ZONE
>> >dimension" starts to dominate.
>>
>> One problem we experienced occurred in the combination of hot-remove and kernelspace allocation use cases.
>> ZONE_NORMAL allows kernel context allocation, but it does not allow hot-remove because the kernel resides there all the time.
>> ZONE_MOVABLE allows hot-remove due to page migration, but it only allows userspace allocation.
>> Alternatively, we allocated a kernel context out of ZONE_MOVABLE by adding GFP_MOVABLE flag.
>> In that case, oopses and system hangs occasionally occurred because ZONE_MOVABLE can be swapped.
>> We resolved the issue using ZONE_EXMEM by allowing a selective choice between the two use cases.
>> As you well know, among heterogeneous DRAM devices, CXL DRAM is the first PCIe-based device, which allows hot-pluggability, different RAS, and extended connectivity.
>> So, we thought it could be a graceful approach to add a new zone and manage the new features separately.
>
>This still does not describe what are the use cases that require having
>kernel allocations on CXL.mem.
>
>I believe it's important to start with explanation *why* it is important to
>have kernel allocations on removable devices.
>

In general, a memory system with DDR and CXL DRAM will have near and far memory, and we think the kernel already includes memory tiering solutions: Meta TPP, zswap, and the page cache.
Some kernel contexts would prefer fast memory, for example hot data with temporal locality, or data that needs fast processing such as metadata or indexes.
Others would be fine with slow memory, for example a zswap page that is in use while swapping.

>> Kindly let me know any advice or comment on our thoughts.
>>
>>
>
>--
>Sincerely yours,
>Mike.