RE(4): FW: [LSF/MM/BPF TOPIC] SMDK inspired MM changes for CXL

>Kyungsan Kim <ks0204.kim@xxxxxxxxxxx> writes:
>
>> I appreciate Dan's careful advice.
>>
>>>Kyungsan Kim wrote:
>>>[..]
>>>> >In addition to CXL memory, we may have other kind of memory in the
>>>> >system, for example, HBM (High Bandwidth Memory), memory in FPGA card,
>>>> >memory in GPU card, etc.  I guess that we need to consider them
>>>> >together.  Do we need to add one zone type for each kind of memory?
>>>> 
>>>> We also don't think a new zone is needed for every single memory
>>>> device.  Our viewpoint is the sole ZONE_NORMAL becomes not enough to
>>>> manage multiple volatile memory devices due to the increased device
>>>> types.  Including CXL DRAM, we think the ZONE_EXMEM can be used to
>>>> represent extended volatile memories that have different HW
>>>> characteristics.
>>>
>>>Some advice for the LSF/MM discussion, the rationale will need to be
>>>more than "we think the ZONE_EXMEM can be used to represent extended
>>>volatile memories that have different HW characteristics". It needs to
>>>be along the lines of "yes, to date Linux has been able to describe DDR
>>>with NUMA effects, PMEM with high write overhead, and HBM with improved
>>>bandwidth not necessarily latency, all without adding a new ZONE, but a
>>>new ZONE is absolutely required now to enable use case FOO, or address
>>>unfixable NUMA problem BAR." Without FOO and BAR to discuss the code
>>>maintainability concern of "fewer degrees of freedom in the ZONE
>>>dimension" starts to dominate.
>>
>> One problem we experienced occurred when combining the hot-remove and kernelspace allocation use cases.
>> ZONE_NORMAL allows kernel context allocation, but it does not allow hot-remove because kernel memory stays resident all the time.
>> ZONE_MOVABLE allows hot-remove thanks to page migration, but it only allows userspace allocation.
>> As an alternative, we allocated a kernel context out of ZONE_MOVABLE by adding the GFP_MOVABLE flag.
>> In that case, oopses and system hangs occasionally occurred because ZONE_MOVABLE pages can be swapped.
>> We resolved the issue using ZONE_EXMEM, which allows selectively choosing between the two use cases.
>
>Sorry, I don't get your idea.  You want the memory range
>
> 1. can be hot-removed
> 2. allow kernel context allocation
>
>This appears impossible to me.  Why can't you just use ZONE_MOVABLE?

Indeed, we tried that approach. We were able to allocate a kernel context from ZONE_MOVABLE using GFP_MOVABLE.
However, we think it would be bad practice, for two reasons.
1. It occasionally causes oopses and system hangs, because kernel pages get migrated during swap or compaction.
2. The design intention of ZONE_MOVABLE is, literally, to keep pages movable, so we thought that allocating a kernel context from that zone goes against the intention.

A kernel context allocated out of ZONE_EXMEM is unmovable.
  a kernel context - alloc_pages(GFP_EXMEM,)
A user context allocated out of ZONE_EXMEM is movable.
  a user context - mmap(,,MAP_EXMEM,) - syscall - alloc_pages(GFP_EXMEM | GFP_MOVABLE,)
This is how ZONE_EXMEM supports both cases.
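
As a rough illustration of the two paths above, here is a minimal userspace sketch in C. It is not the SMDK implementation: the SK_* names, the flag bit values, and sk_place() are hypothetical and exist only for this example, and the zone choice is far simpler than the kernel's real GFP-to-zone mapping. It only models the point that the EXMEM flag selects the zone, while the MOVABLE flag independently decides whether the page may be migrated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration; NOT the kernel's real gfp_t values. */
#define SK_GFP_MOVABLE 0x1u
#define SK_GFP_EXMEM   0x2u

enum sk_zone { SK_ZONE_NORMAL, SK_ZONE_MOVABLE, SK_ZONE_EXMEM };

struct sk_placement {
    enum sk_zone zone; /* zone the request is served from */
    bool movable;      /* may the page be migrated later? */
};

/* Simplified model (an assumption): EXMEM picks the zone, MOVABLE picks mobility. */
static struct sk_placement sk_place(unsigned int gfp)
{
    struct sk_placement p;

    p.movable = (gfp & SK_GFP_MOVABLE) != 0;
    if (gfp & SK_GFP_EXMEM)
        p.zone = SK_ZONE_EXMEM;
    else if (p.movable)
        p.zone = SK_ZONE_MOVABLE;
    else
        p.zone = SK_ZONE_NORMAL;
    return p;
}

/* Walks through the kernel-context and user-context cases from the text. */
void sk_demo(void)
{
    /* kernel context: alloc_pages(GFP_EXMEM,) */
    struct sk_placement k = sk_place(SK_GFP_EXMEM);
    /* user context: alloc_pages(GFP_EXMEM | GFP_MOVABLE,) */
    struct sk_placement u = sk_place(SK_GFP_EXMEM | SK_GFP_MOVABLE);

    assert(k.zone == SK_ZONE_EXMEM && !k.movable);
    assert(u.zone == SK_ZONE_EXMEM && u.movable);
}
```

In this model both requests land in the same zone and differ only in the movable bit, which is the selective behavior described above.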

>
>Best Regards,
>Huang, Ying
>
>> As you well know, among heterogeneous DRAM devices, CXL DRAM is the first PCIe-based device, which allows hot-pluggability, different RAS, and extended connectivity.
>> So, we thought that adding a new zone and managing the new features separately could be a graceful approach.
>>
>> Kindly let me know if you have any advice or comments on our thoughts.


