Hi Dragan,

On Thu, Mar 30, 2023 at 05:03:24PM -0500, Dragan Stancevic wrote:
> On 3/26/23 02:21, Mike Rapoport wrote:
> > Hi,
> >
> > [..]
> > > One problem we experienced occurred with the combination of the
> > > hot-remove and kernelspace allocation usecases.
> > > ZONE_NORMAL allows kernel context allocation, but it does not allow
> > > hot-remove because the kernel resides there all the time.
> > > ZONE_MOVABLE allows hot-remove due to the page migration, but it
> > > only allows userspace allocation.
> > > Alternatively, we allocated a kernel context out of ZONE_MOVABLE by
> > > adding the GFP_MOVABLE flag.
> > > In that case, oopses and system hangs occasionally occurred because
> > > ZONE_MOVABLE can be swapped.
> > > We resolved the issue using ZONE_EXMEM by allowing a selective
> > > choice between the two usecases.
> > > As you well know, among heterogeneous DRAM devices, CXL DRAM is the
> > > first PCIe-based device, which allows hot-pluggability, different
> > > RAS, and extended connectivity.
> > > So, we thought adding a new zone and managing the new features
> > > separately could be a graceful approach.
> >
> > This still does not describe what are the use cases that require having
> > kernel allocations on CXL.mem.
> >
> > I believe it's important to start with explanation *why* it is important to
> > have kernel allocations on removable devices.
>
> Hi Mike,
>
> not speaking for Kyungsan here, but I am starting to tackle hypervisor
> clustering and VM migration over cxl.mem [1].
>
> And in my mind, at least one reason that I can think of having kernel
> allocations from cxl.mem devices is where you have multiple VH connections
> sharing the memory [2]. Where for example you have a user space application
> stored in cxl.mem, and then you want the metadata about this
> process/application that the kernel keeps on one hypervisor be "passed on"
> to another hypervisor. So basically the same way processors in a single
> hypervisors cooperate on memory, you extend that across processors that span
> over physical hypervisors. If that makes sense...

Let me reiterate to make sure I understand your example.
If we focus on the VM use case, your suggestion is to store the VM's memory
and the associated KVM structures on a CXL.mem device shared by several
nodes.

Even putting aside the aspect of keeping KVM structures on presumably
slower memory, what will ZONE_EXMEM provide that cannot be accomplished by
having the CXL memory in a memoryless node and using that node to allocate
VM metadata?

> [1] A high-level explanation is at http://nil-migration.org
> [2] Compute Express Link Specification r3.0, v1.0 8/1/22, Page 51, figure
> 1-4, black color scheme circle(3) and bars.
>
> --
> Peace can only come as a natural consequence
> of universal enlightenment -Dr. Nikola Tesla
>

-- 
Sincerely yours,
Mike.